TRANSPORTERS AND ION CHANNELS

TECHNICAL FIELD 

This invention relates to nucleic acid and amino acid sequences of transporters and ion 
channels and to the use of these sequences in the diagnosis, treatment, and prevention of transport,
neurological, muscle, immunological and cell proliferative disorders, and in the assessment of the 
effects of exogenous compounds on the expression of nucleic acid and amino acid sequences of 
transporters and ion channels. 

BACKGROUND OF THE INVENTION

Eukaryotic cells are surrounded and subdivided into functionally distinct organelles by
hydrophobic lipid bilayer membranes which are highly impermeable to most polar molecules. Cells
and organelles require transport proteins to import and export essential nutrients and metal ions
including K+, NH4+, Pi, SO4(2-), sugars, and vitamins, as well as various metabolic waste products.
Transport proteins also play roles in antibiotic resistance, toxin secretion, ion balance, synaptic
neurotransmission, kidney function, intestinal absorption, tumor growth, and other diverse cell
functions (Griffith, J. and C. Sansom (1998) The Transporter Facts Book, Academic Press, San Diego
CA, pp. 3-29). Transport can occur by a passive concentration-dependent mechanism, or can be
linked to an energy source such as ATP hydrolysis or an ion gradient. Proteins that function in
transport include carrier proteins, which bind to a specific solute and undergo a conformational
change that translocates the bound solute across the membrane, and channel proteins, which form
hydrophilic pores that allow specific solutes to diffuse through the membrane down an
electrochemical solute gradient.

Carrier proteins which transport a single solute from one side of the membrane to the other
are called uniporters. In contrast, coupled transporters link the transfer of one solute with
simultaneous or sequential transfer of a second solute, either in the same direction (symport) or in the
opposite direction (antiport). For example, intestinal and kidney epithelia contain a variety of
symporter systems driven by the sodium gradient that exists across the plasma membrane. Sodium
moves into the cell down its electrochemical gradient and brings the solute into the cell with it. The
sodium gradient that provides the driving force for solute uptake is maintained by the ubiquitous
Na+/K+ ATPase system. Sodium-coupled transporters include the mammalian glucose transporter
(SGLT1), iodide transporter (NIS), and multivitamin transporter (SMVT). All three transporters have
twelve putative transmembrane segments, extracellular glycosylation sites, and cytoplasmically-oriented
N- and C-termini. NIS plays a crucial role in the evaluation, diagnosis, and treatment of
various thyroid pathologies because it is the molecular basis for radioiodide thyroid-imaging
techniques and for specific targeting of radioisotopes to the thyroid gland (Levy, O. et al. (1997)
Proc. Natl. Acad. Sci. USA 94:5568-5573). SMVT is expressed in the intestinal mucosa, kidney, and
placenta, and is implicated in the transport of the water-soluble vitamins, e.g., biotin and pantothenate
(Prasad, P.D. et al. (1998) J. Biol. Chem. 273:7501-7506).

One of the largest families of transporters is the major facilitator superfamily (MFS), also
called the uniporter-symporter-antiporter family. MFS transporters are single polypeptide carriers
that transport small solutes in response to ion gradients. Members of the MFS are found in all classes
of living organisms, and include transporters for sugars, oligosaccharides, phosphates, nitrates,
nucleosides, monocarboxylates, and drugs. MFS transporters found in eukaryotes all have a structure
comprising 12 transmembrane segments (Pao, S.S. et al. (1998) Microbiol. Molec. Biol. Rev. 62:1-34).
The largest family of MFS transporters is the sugar transporter family, which includes the seven
glucose transporters (GLUT1-GLUT7) found in humans that are required for the transport of glucose
and other hexose sugars. These glucose transport proteins have unique tissue distributions and
physiological functions. GLUT1 provides many cell types with their basal glucose requirements and
transports glucose across epithelial and endothelial barrier tissues; GLUT2 facilitates glucose uptake
or efflux from the liver; GLUT3 regulates glucose supply to neurons; GLUT4 is responsible for
insulin-regulated glucose disposal; and GLUT5 regulates fructose uptake into skeletal muscle.
Defects in glucose transporters are involved in a recently identified neurological syndrome causing
infantile seizures and developmental delay, as well as glycogen storage disease, Fanconi-Bickel
syndrome, and non-insulin-dependent diabetes mellitus (Mueckler, M. (1994) Eur. J. Biochem.
219:713-725; Longo, N. and L.J. Elsas (1998) Adv. Pediatr. 45:293-313).

Monocarboxylate anion transporters are proton-coupled symporters with a broad substrate
specificity that includes L-lactate, pyruvate, and the ketone bodies acetate, acetoacetate, and
beta-hydroxybutyrate. At least seven isoforms have been identified to date. The isoforms are
predicted to have twelve transmembrane (TM) helical domains with a large intracellular loop between
TM6 and TM7, and play a critical role in maintaining intracellular pH by removing the protons that
are produced stoichiometrically with lactate during glycolysis. The best characterized
H+-monocarboxylate transporter is that of the erythrocyte membrane, which transports L-lactate and a
wide range of other aliphatic monocarboxylates. Other cells possess H+-linked monocarboxylate
transporters with differing substrate and inhibitor selectivities. In particular, cardiac muscle and
tumor cells have transporters that differ in their Km values for certain substrates, including
stereoselectivity for L- over D-lactate, and in their sensitivity to inhibitors. There are
Na+-monocarboxylate cotransporters on the luminal surface of intestinal and kidney epithelia, which
allow the uptake of lactate, pyruvate, and ketone bodies in these tissues. In addition, there are
specific and selective transporters for organic cations and organic anions in organs including the
kidney, intestine, and liver. Organic anion transporters are selective for hydrophobic, charged
molecules with electron-attracting side groups. Organic cation transporters, such as the ammonium
transporter, mediate the secretion of a variety of drugs and endogenous metabolites, and contribute to
the maintenance of intercellular pH (Poole, R.C. and A.P. Halestrap (1993) Am. J. Physiol.
264:C761-C782; Price, N.T. et al. (1998) Biochem. J. 329:321-328; and Martinelle, K. and L.
Haggstrom (1993) J. Biotechnol. 30:339-350).

ATP-binding cassette (ABC) transporters are members of a superfamily of membrane
proteins that transport substances ranging from small molecules such as ions, sugars, amino acids,
peptides, and phospholipids, to lipopeptides, large proteins, and complex hydrophobic drugs. ABC
transporters consist of four modules: two nucleotide-binding domains (NBD), which hydrolyze ATP
to supply the energy required for transport, and two membrane-spanning domains (MSD), each
containing six putative transmembrane segments. These four modules may be encoded by a single
gene, as is the case for the cystic fibrosis transmembrane conductance regulator (CFTR), or by
separate genes. When encoded by separate genes, each gene product contains a single NBD and MSD.
These "half-molecules" form homo- and heterodimers, such as Tap1 and Tap2, the endoplasmic
reticulum-based major histocompatibility (MHC) peptide transport system. Several genetic diseases
are attributed to defects in ABC transporters, such as the following diseases and their corresponding
proteins: cystic fibrosis (CFTR, an ion channel), adrenoleukodystrophy (adrenoleukodystrophy
protein, ALDP), Zellweger syndrome (peroxisomal membrane protein-70, PMP70), and hyperinsulinemic
hypoglycemia (sulfonylurea receptor, SUR). Overexpression of the multidrug resistance (MDR)
protein, another ABC transporter, in human cancer cells makes the cells resistant to a variety of
cytotoxic drugs used in chemotherapy (Taglicht, D. and S. Michaelis (1998) Meth. Enzymol.
292:130-162).

A number of metal ions such as iron, zinc, copper, cobalt, manganese, molybdenum,
selenium, nickel, and chromium are important as cofactors for a number of enzymes. For example,
copper is involved in hemoglobin synthesis, connective tissue metabolism, and bone development, by
acting as a cofactor in oxidoreductases such as superoxide dismutase, ferroxidase (ceruloplasmin),
and lysyl oxidase. Copper and other metal ions must be provided in the diet, and are absorbed by
transporters in the gastrointestinal tract. Plasma proteins transport the metal ions to the liver and
other target organs, where specific transporters move the ions into cells and cellular organelles as
needed. Imbalances in metal ion metabolism have been associated with a number of disease states
(Danks, D.M. (1986) J. Med. Genet. 23:99-106).

Transport of fatty acids across the plasma membrane can occur by diffusion, a high capacity,
low affinity process. However, under normal physiological conditions a significant fraction of fatty
acid transport appears to occur via a high affinity, low capacity protein-mediated transport process.
Fatty acid transport protein (FATP), an integral membrane protein with four transmembrane
segments, is expressed in tissues exhibiting high levels of plasma membrane fatty acid flux, such as
muscle, heart, and adipose. Expression of FATP is upregulated in 3T3-L1 cells during adipose
conversion, and expression in COS7 fibroblasts elevates uptake of long-chain fatty acids (Hui, T.Y. et
al. (1998) J. Biol. Chem. 273:27420-27429).

The lipocalin superfamily constitutes a phylogenetically conserved group of more than forty
proteins that function as extracellular ligand-binding proteins which bind and transport small
hydrophobic molecules. Members of this family function as carriers of retinoids, odorants,
chromophores, pheromones, allergens, and sterols, and in a variety of processes including nutrient
transport, cell growth regulation, immune response, and prostaglandin synthesis. A subset of these
proteins may be multifunctional, serving as either a biosynthetic enzyme or as a specific enzyme
inhibitor (Tanaka, T. et al. (1997) J. Biol. Chem. 272:15789-15795; and van't Hof, W. et al. (1997)
J. Biol. Chem. 272:1837-1841).

Members of the lipocalin family display unusually low levels of overall sequence
conservation. Pairwise sequence identity often falls below 20%. Sequence similarity between family
members is limited to conserved cysteines which form disulfide bonds and three motifs which form a
juxtaposed cluster that functions as a target cell recognition site. The lipocalins share an
eight-stranded, anti-parallel beta-sheet which folds back on itself to form a continuously
hydrogen-bonded beta-barrel. The pocket formed by the barrel functions as an internal ligand binding
site. Seven loops (L1 to L7) form short beta-hairpins, except loop L1 which is a large omega loop that
forms a lid to partially close the internal ligand-binding site (Flower (1996) Biochem. J. 318:1-14).

Lipocalins are important transport molecules. Each lipocalin associates with a particular
ligand and delivers that ligand to appropriate target sites within the organism. Retinol-binding
protein (RBP), one of the best characterized lipocalins, transports retinol from stores within the liver
to target tissues. Apolipoprotein D (apo D), a component of high density lipoproteins (HDLs) and
low density lipoproteins (LDLs), functions in the targeted collection and delivery of cholesterol
throughout the body. Lipocalins are also involved in cell regulatory processes. Apo D, which is
identical to gross-cystic-disease-fluid protein (GCDFP)-24, is a progesterone/pregnenolone-binding
protein expressed at high levels in breast cyst fluid. Secretion of apo D in certain human breast
cancer cell lines is accompanied by reduced cell proliferation and progression of cells to a more
differentiated phenotype. Similarly, apo D and another lipocalin, alpha1-acid glycoprotein (AGP), are
involved in nerve cell regeneration. AGP is also involved in anti-inflammatory and
immunosuppressive activities. AGP is one of the positive acute-phase proteins (APP); circulating
levels of AGP increase in response to stress and inflammatory stimulation. AGP accumulates at sites
of inflammation where it inhibits platelet and neutrophil activation and inhibits phagocytosis. The
immunomodulatory properties of AGP are due to glycosylation. AGP is 40% carbohydrate, making it
unusually acidic and soluble. The glycosylation pattern of AGP changes during acute-phase
response, and deglycosylated AGP has no immunosuppressive activity (Flower (1994) FEBS Lett.
354:7-11; Flower (1996) supra).

The lipocalin superfamily also includes several animal allergens, including the mouse major
urinary protein (mMUP), the rat alpha-2-microglobulin (rA2U), the bovine beta-lactoglobulin (Blg), the
cockroach allergen (Bla g4), bovine dander allergen (Bos d2), and the major horse allergen,
designated Equus caballus allergen 1 (Equ c1). Equ c1 is a powerful allergen responsible for about
80% of anti-horse IgE antibody response in patients who are chronically exposed to horse allergens.
It appears that lipocalins may contain a common structure that is able to induce the IgE response
(Gregoire, C. et al. (1996) J. Biol. Chem. 271:32951-32959).

Lipocalins are used as diagnostic and prognostic markers in a variety of disease states. The
plasma level of AGP is monitored during pregnancy and in diagnosis and prognosis of conditions
including cancer chemotherapy, renal dysfunction, myocardial infarction, arthritis, and multiple
sclerosis. RBP is used clinically as a marker of tubular reabsorption in the kidney, and apo D is a
marker in gross cystic breast disease (Flower (1996) supra). Additionally, the use of lipocalin animal
allergens may help in the diagnosis of allergic reactions to horses (Gregoire, supra), pigs, cockroaches,
mice, and rats.

Mitochondrial carrier proteins are transmembrane-spanning proteins which transport ions and
charged metabolites between the cytosol and the mitochondrial matrix. Examples include the ADP,
ATP carrier protein; the 2-oxoglutarate/malate carrier; the phosphate carrier protein; the pyruvate
carrier; the dicarboxylate carrier which transports malate, succinate, fumarate, and phosphate; the
tricarboxylate carrier which transports citrate and malate; and the Graves' disease carrier protein, a
protein recognized by IgG in patients with active Graves' disease, an autoimmune disorder resulting
in hyperthyroidism. Proteins in this family consist of three tandem repeats of an approximately 100
amino acid domain, each of which contains two transmembrane regions (Stryer, L. (1995)
Biochemistry, W.H. Freeman and Company, New York NY, p. 551; PROSITE PDOC00189
Mitochondrial energy transfer proteins signature; Online Mendelian Inheritance in Man (OMIM)
*275000 Graves Disease).

This class of transporters also includes the mitochondrial uncoupling proteins, which create
proton leaks across the inner mitochondrial membrane, thus uncoupling oxidative phosphorylation
from ATP synthesis. The result is energy dissipation in the form of heat. Mitochondrial uncoupling
proteins have been implicated as modulators of thermoregulation and metabolic rate, and have been
proposed as potential targets for drugs against metabolic diseases such as obesity (Ricquier, D. et al.
(1999) J. Int. Med. 245:637-642).

Ion Channels

The electrical potential of a cell is generated and maintained by controlling the movement of
ions across the plasma membrane. The movement of ions requires ion channels, which form
ion-selective pores within the membrane. There are two basic types of ion channels, ion transporters
and gated ion channels. Ion transporters utilize the energy obtained from ATP hydrolysis to actively
transport an ion against the ion's concentration gradient. Gated ion channels allow passive flow of an
ion down the ion's electrochemical gradient under restricted conditions. Together, these types of ion
channels generate, maintain, and utilize an electrochemical gradient that is used in 1) electrical
impulse conduction down the axon of a nerve cell, 2) transport of molecules into cells against
concentration gradients, 3) initiation of muscle contraction, and 4) endocrine cell secretion.

Ion Transporters

Ion transporters generate and maintain the resting electrical potential of a cell. Utilizing the
energy derived from ATP hydrolysis, they transport ions against the ion's concentration gradient.
These transmembrane ATPases are divided into three families. The phosphorylated (P) class ion
transporters, including Na+-K+ ATPase, Ca2+-ATPase, and H+-ATPase, are activated by a
phosphorylation event. P-class ion transporters are responsible for maintaining resting potential
distributions such that cytosolic concentrations of Na+ and Ca2+ are low and cytosolic concentration
of K+ is high. The vacuolar (V) class of ion transporters includes H+ pumps on intracellular
organelles, such as lysosomes and Golgi. V-class ion transporters are responsible for generating the
low pH within the lumen of these organelles that is required for function. The coupling factor (F)
class consists of H+ pumps in the mitochondria. F-class ion transporters utilize a proton gradient to
generate ATP from ADP and inorganic phosphate (Pi).

The P-ATPases are hexamers of a 100 kD subunit with ten transmembrane domains and
several large cytoplasmic regions that may play a role in ion binding (Scarborough, G.A. (1999) Curr.
Opin. Cell Biol. 11:517-522). The V-ATPases are composed of two functional domains: the V1
domain, a peripheral complex responsible for ATP hydrolysis; and the V0 domain, an integral
complex responsible for proton translocation across the membrane. The F-ATPases are structurally
and evolutionarily related to the V-ATPases. The F-ATPase F0 domain contains 12 copies of the c
subunit, a highly hydrophobic protein composed of two transmembrane domains and containing a
single buried carboxyl group in TM2 that is essential for proton transport. The V-ATPase V0 domain
contains three types of homologous c subunits with four or five transmembrane domains and the
essential carboxyl group in TM4 or TM3. Both types of complex also contain a single a subunit that
may be involved in regulating the pH dependence of activity (Forgac, M. (1999) J. Biol. Chem.
274:12951-12954).

The resting potential of the cell is utilized in many processes involving carrier proteins and
gated ion channels. Carrier proteins utilize the resting potential to transport molecules into and out of
the cell. Amino acid and glucose transport into many cells is linked to sodium ion co-transport
(symport) so that the movement of Na+ down an electrochemical gradient drives transport of the other
molecule up a concentration gradient. Similarly, cardiac muscle links transfer of Ca2+ out of the cell
with transport of Na+ into the cell (antiport).

Gated Ion Channels

Gated ion channels control ion flow by regulating the opening and closing of pores. The
ability to control ion flux through various gating mechanisms allows ion channels to mediate such
diverse signaling and homeostatic functions as neuronal and endocrine signaling, muscle contraction,
fertilization, and regulation of ion and pH balance. Gated ion channels are categorized according to
the manner of regulating the gating function. Mechanically-gated channels open their pores in
response to mechanical stress; voltage-gated channels (e.g., Na+, K+, Ca2+, and Cl- channels) open
their pores in response to changes in membrane potential; and ligand-gated channels (e.g.,
acetylcholine-, serotonin-, and glutamate-gated cation channels, and GABA- and glycine-gated
chloride channels) open their pores in the presence of a specific ion, nucleotide, or neurotransmitter.
The gating properties of a particular ion channel (i.e., its threshold for and duration of opening and
closing) are sometimes modulated by association with auxiliary channel proteins and/or
post-translational modifications, such as phosphorylation.

Mechanically-gated or mechanosensitive ion channels act as transducers for the senses of
touch, hearing, and balance, and also play important roles in cell volume regulation, smooth muscle
contraction, and cardiac rhythm generation. A stretch-inactivated channel (SIC) was recently cloned
from rat kidney. The SIC channel belongs to a group of channels which are activated by pressure or
stress on the cell membrane and conduct both Ca2+ and Na+ (Suzuki, M. et al. (1999) J. Biol. Chem.
274:6330-6335).

The pore-forming subunits of the voltage-gated cation channels form a superfamily of ion
channel proteins. The characteristic domain of these channel proteins comprises six transmembrane
domains (S1-S6), a pore-forming region (P) located between S5 and S6, and intracellular amino and
carboxy termini. In the Na+ and Ca2+ subfamilies, this domain is repeated four times, while in the K+
channel subfamily, each channel is formed from a tetramer of either identical or dissimilar subunits.
The P region contains information specifying the ion selectivity for the channel. In the case of K+
channels, a GYG tripeptide is involved in this selectivity (Ishii, T.M. et al. (1997) Proc. Natl. Acad.
Sci. USA 94:11651-11656).

Voltage-gated Na+ and K+ channels are necessary for the function of electrically excitable
cells, such as nerve and muscle cells. Action potentials, which lead to neurotransmitter release and
muscle contraction, arise from large, transient changes in the permeability of the membrane to Na+
and K+ ions. Depolarization of the membrane beyond the threshold level opens voltage-gated Na+
channels. Sodium ions flow into the cell, further depolarizing the membrane and opening more
voltage-gated Na+ channels, which propagates the depolarization down the length of the cell.
Depolarization also opens voltage-gated potassium channels. Consequently, potassium ions flow
outward, which leads to repolarization of the membrane. Voltage-gated channels utilize charged
residues in the fourth transmembrane segment (S4) to sense voltage change. The open state lasts only
about 1 millisecond, at which time the channel spontaneously converts into an inactive state that
cannot be opened irrespective of the membrane potential. Inactivation is mediated by the channel's
N-terminus, which acts as a plug that closes the pore. The transition from an inactive to a closed state
requires a return to resting potential.
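
The gating cycle described above (closed, open, inactive, and back to closed) can be sketched as a small state machine. This is a minimal illustration rather than a biophysical model: the threshold and resting potential values and the names ChannelState and step are assumptions introduced here, and only the open time of about 1 millisecond comes from the text.

```python
from enum import Enum


class ChannelState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    INACTIVE = "inactive"


# Illustrative values only (assumptions, not taken from the text).
THRESHOLD_MV = -55.0     # hypothetical activation threshold
RESTING_MV = -70.0       # hypothetical resting potential
OPEN_DURATION_MS = 1.0   # the text gives "about 1 millisecond"


def step(state, time_in_state_ms, membrane_mv):
    """Return the next state of a single voltage-gated channel."""
    if state is ChannelState.CLOSED:
        # Depolarization beyond the threshold opens the channel.
        if membrane_mv > THRESHOLD_MV:
            return ChannelState.OPEN
        return ChannelState.CLOSED
    if state is ChannelState.OPEN:
        # After about 1 ms the channel inactivates spontaneously.
        if time_in_state_ms >= OPEN_DURATION_MS:
            return ChannelState.INACTIVE
        return ChannelState.OPEN
    # INACTIVE: cannot reopen at any potential; it resets to CLOSED
    # only once the membrane has returned to resting potential.
    if membrane_mv <= RESTING_MV:
        return ChannelState.CLOSED
    return ChannelState.INACTIVE
```

In this sketch an inactive channel stays inactive however strongly the membrane is depolarized, which mirrors the statement that the inactive state cannot be opened irrespective of the membrane potential.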

Voltage-gated Na+ channels are heterotrimeric complexes composed of a 260 kDa
pore-forming alpha subunit that associates with two smaller auxiliary subunits, beta1 and beta2. The
beta2 subunit is an integral membrane glycoprotein that contains an extracellular Ig domain, and its
association with alpha and beta1 subunits correlates with increased functional expression of the
channel, a change in its gating properties, as well as an increase in whole cell capacitance due to an
increase in membrane surface area (Isom, L.L. et al. (1995) Cell 83:433-442).

Non-voltage-gated Na+ channels include the members of the amiloride-sensitive Na+
channel/degenerin (NaC/DEG) family. Channel subunits of this family are thought to consist of two
transmembrane domains flanking a long extracellular loop, with the amino and carboxyl termini
located within the cell. The NaC/DEG family includes the epithelial Na+ channel (ENaC) involved in
Na+ reabsorption in epithelia including the airway, distal colon, cortical collecting duct of the kidney,
and exocrine duct glands. Mutations in ENaC result in pseudohypoaldosteronism type 1 and Liddle's
syndrome (pseudohyperaldosteronism). The NaC/DEG family also includes the recently
characterized H+-gated cation channels or acid-sensing ion channels (ASIC). ASIC subunits are
expressed in the brain and form heteromultimeric Na+-permeable channels. These channels require
acid pH fluctuations for activation. ASIC subunits show homology to the degenerins, a family of
mechanically-gated channels originally isolated from C. elegans. Mutations in the degenerins cause
neurodegeneration. ASIC subunits may also have a role in neuronal function, or in pain perception,
since tissue acidosis causes pain (Waldmann, R. and M. Lazdunski (1998) Curr. Opin. Neurobiol.
8:418-424; Eglen, R.M. et al. (1999) Trends Pharmacol. Sci. 20:337-342).

K+ channels are located in all cell types, and may be regulated by voltage, ATP
concentration, or second messengers such as Ca2+ and cAMP. In non-excitable tissue, K+ channels
are involved in protein synthesis, control of endocrine secretions, and the maintenance of osmotic
equilibrium across membranes. In neurons and other excitable cells, in addition to regulating action
potentials and repolarizing membranes, K+ channels are responsible for setting the resting membrane
potential. The cytosol contains non-diffusible anions and, to balance this net negative charge, the cell
contains a Na+-K+ pump and ion channels that provide the redistribution of Na+, K+, and Cl-. The
pump actively transports Na+ out of the cell and K+ into the cell in a 3:2 ratio. Ion channels in the
plasma membrane allow K+ and Cl- to flow by passive diffusion. Because of the high negative charge
within the cytosol, Cl- flows out of the cell. The flow of K+ is balanced by an electromotive force
pulling K+ into the cell, and a K+ concentration gradient pushing K+ out of the cell. Thus, the resting
membrane potential is primarily regulated by K+ flow (Salkoff, L. and T. Jegla (1995) Neuron
15:489-492).
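
The balance between the electrical force and the concentration gradient described above is conventionally quantified by the Nernst equation, E = (RT/zF) ln([ion]out/[ion]in). The sketch below evaluates it in Python; the function name nernst_mv and the ion concentrations are illustrative textbook-style assumptions, not values taken from this document.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_mv(conc_out_mm, conc_in_mm, valence, temp_c=37.0):
    """Equilibrium (Nernst) potential in millivolts for one ion species."""
    temp_k = temp_c + 273.15
    volts = (R * temp_k) / (valence * F) * math.log(conc_out_mm / conc_in_mm)
    return 1000.0 * volts


# Assumed concentrations in mM (hypothetical, for illustration only).
e_k = nernst_mv(conc_out_mm=5.0, conc_in_mm=140.0, valence=1)
e_na = nernst_mv(conc_out_mm=145.0, conc_in_mm=12.0, valence=1)
e_cl = nernst_mv(conc_out_mm=110.0, conc_in_mm=10.0, valence=-1)

print(f"E_K  = {e_k:.1f} mV")
print(f"E_Na = {e_na:.1f} mV")
print(f"E_Cl = {e_cl:.1f} mV")
```

With these assumed concentrations the K+ equilibrium potential is strongly negative (near -90 mV) while that of Na+ is positive, which is consistent with a resting potential set mainly by K+ flow.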

Potassium channel subunits of the Shaker-like superfamily all have the characteristic six
transmembrane/1 pore domain structure. Four subunits combine as homo- or heterotetramers to form
functional K+ channels. These pore-forming subunits also associate with various cytoplasmic beta
subunits that alter channel inactivation kinetics. The Shaker-like channel family includes the
voltage-gated K+ channels as well as the delayed rectifier type channels such as the human
ether-a-go-go related gene (HERG) associated with long QT, a cardiac dysrhythmia syndrome
(Curran, M.E. (1998) Curr. Opin. Biotechnol. 9:565-572; Kaczorowski, G.J. and M.L. Garcia (1999)
Curr. Opin. Chem. Biol. 3:448-458).

A second superfamily of K+ channels is composed of the inward rectifying channels (Kir).
Kir channels have the property of preferentially conducting K+ currents in the inward direction.
These proteins consist of a single potassium selective pore domain and two transmembrane domains,
which correspond to the fifth and sixth transmembrane domains of voltage-gated K+ channels. Kir
subunits also associate as tetramers. The Kir family includes ROMK1, mutations in which lead to
Bartter syndrome, a renal tubular disorder. Kir channels are also involved in regulation of cardiac
pacemaker activity, seizures and epilepsy, and insulin regulation (Doupnik, C.A. et al. (1995) Curr.
Opin. Neurobiol. 5:268-277; Curran, supra).

The recently recognized TWIK K+ channel family includes the mammalian TWIK-1, TREK-1,
and TASK proteins. Members of this family possess an overall structure with four transmembrane
domains and two P domains. These proteins are probably involved in controlling the resting potential
in a large set of cell types (Duprat, F. et al. (1997) EMBO J. 16:5464-5471).

The voltage-gated Ca2+ channels have been classified into several subtypes based upon their
electrophysiological and pharmacological characteristics. L-type Ca2+ channels are predominantly
expressed in heart and skeletal muscle where they play an essential role in excitation-contraction
coupling. T-type channels are important for cardiac pacemaker activity, while N-type and P/Q-type
channels are involved in the control of neurotransmitter release in the central and peripheral nervous
system. The L-type and N-type voltage-gated Ca2+ channels have been purified and, though their
functions differ dramatically, they have similar subunit compositions. The channels are composed of
three subunits. The alpha1 subunit forms the membrane pore and voltage sensor, while the alpha2delta
and beta subunits modulate the voltage-dependence, gating properties, and the current amplitude of the
channel. These subunits are encoded by at least six alpha1, one alpha2delta, and four beta genes. A
fourth subunit, gamma, has been identified in skeletal muscle (Walker, D. et al. (1998) J. Biol. Chem.
273:2361-2367; McCleskey, E.W. (1994) Curr. Opin. Neurobiol. 4:304-312).

The high-voltage-activated Ca 2+ channels that have been characterized biochemically include 
complexes of a pore-forming alphal subunit of approximately 190-250 kDa; a transmembrane 
complex of alpha2 and delta subunits; an intracellular beta subunit; and in some cases a 
transmembrane gamma subunit A variety of alphal subunits, alpha2delta complexes, beta subunits, 

10 and gamma subunits are known. The Cavl family of alphal subunits conduct L-type Ca ^ currents, 
which initiate muscle contraction, endocrine secretion, and gene transcription, and are regulated 
primarily by second messenger-activated protein phosphorylation pathways. The Cav2 family of 
alphal subunits conduct N-type, P/Q-type, and R-type Ca 2+ currents, which initiate rapid synaptic 
transmission and are regulated primarily by direct interaction with G proteins and SNARE proteins 

15 and secondarily by protein phosphorylation. The Cav3 family of alphal subunits conduct T-type Ca 
2+ currents, which are activated and inactivated more rapidly and at more negative membrane 
potentials than other Ca 2+ current types. The distinct structures and patterns of regulation of these 
three families of Ca 2+ channels provide an array of Ca 2 * entry pathways in response to changes in 
membrane potential and a range of possibilities for regulation of Ca ^ entry by second messenger 

20 pathways and interacting proteins (Catterall, W.A. (2000) Annu. Rev. Cell Dev. Biol. 16:521-555). 
The alpha-2 subunit of the voltage-gated Ca ^-channel may include one or more Cache 
domains. An extracellular Cache domain may be fused to an intracellular catalytic domain, such as 
the histidine kinase, PP2C phosphatase, GGDEF (a predicted diguanylate cyclase), HD-GYP (a 
predicted phosphodiesterase) or adenylyl cyclase domain, or to a noncatalytic domain, like the 

25 methyl-accepting, DNA-binding winged helix-turn-helix, GAF, PAS or HAMP (a domain found in 
istidine kinases, denylyl cyclases, ethyl-binding proteins and phosphatases). Small molecules are 
bound via the Cache domain and this signal is converted into diverse outputs depending on the 
intracellular domains (Anantharaman, V. and Aravind, L.(2000) Trends Biochem. Sci. 25:535-537). 
The transient receptor family (Trp) of calcium ion channels are thought to mediate
capacitative calcium entry (CCE). CCE is the Ca2+ influx into cells to resupply Ca2+ stores depleted
by the action of inositol triphosphate (IP3) and other agents in response to numerous hormones and
growth factors. Trp and Trp-like were first cloned from Drosophila and have similarity to voltage
gated Ca2+ channels in the S3 through S6 regions. This suggests that Trp and/or related proteins may
form mammalian CCE channels (Zhu, X. et al. (1996) Cell 85:661-671; Boulay, G. et al. (1997) J.
Biol. Chem. 272:29672-29680). Melastatin is a gene isolated in both the mouse and human, whose
expression in melanoma cells is inversely correlated with melanoma aggressiveness in vivo. The
human cDNA transcript corresponds to a 1533-amino acid protein having homology to members of
the Trp family. It has been proposed that the combined use of melastatin mRNA expression status
and tumor thickness might allow for the determination of subgroups of patients at both low and high
risk for developing metastatic disease (Duncan, L.M. et al. (2001) J. Clin. Oncol. 19:568-576).

Chloride channels are necessary in endocrine secretion and in regulation of cytosolic and 
organelle pH. In secretory epithelial cells, Cl- enters the cell across a basolateral membrane through
an Na+, K+/Cl- cotransporter, accumulating in the cell above its electrochemical equilibrium
concentration. Secretion of Cl- from the apical surface, in response to hormonal stimulation, leads to
flow of Na+ and water into the secretory lumen. The cystic fibrosis transmembrane conductance
regulator (CFTR) is a chloride channel encoded by the gene for cystic fibrosis, a common fatal
genetic disorder in humans. CFTR is a member of the ABC transporter family, and is composed of
two domains each consisting of six transmembrane domains followed by a nucleotide-binding site.
Loss of CFTR function decreases transepithelial water secretion and, as a result, the layers of mucus
that coat the respiratory tree, pancreatic ducts, and intestine are dehydrated and difficult to clear. The
resulting blockage of these sites leads to pancreatic insufficiency, "meconium ileus", and devastating 
"chronic obstructive pulmonary disease" (Al-Awqati, Q. et al. (1992) J. Exp. Biol. 172:245-266). 

The voltage-gated chloride channels (CLC) are characterized by 10-12 transmembrane 
domains, as well as two small globular domains known as CBS domains. The CLC subunits
probably function as homotetramers. CLC proteins are involved in regulation of cell volume,
membrane potential stabilization, signal transduction, and transepithelial transport. Mutations in
CLC-1, expressed predominantly in skeletal muscle, are responsible for autosomal recessive
generalized myotonia and autosomal dominant myotonia congenita, while mutations in the kidney
channel CLC-5 lead to kidney stones (Jentsch, T.J. (1996) Curr. Opin. Neurobiol. 6:303-310).

Ligand-gated channels open their pores when an extracellular or intracellular mediator binds
to the channel. Neurotransmitter-gated channels are channels that open when a neurotransmitter
binds to their extracellular domain. These channels exist in the postsynaptic membrane of nerve or
muscle cells. There are two types of neurotransmitter-gated channels. Sodium channels open in
response to excitatory neurotransmitters, such as acetylcholine, glutamate, and serotonin. This
opening causes an influx of Na+ and produces the initial localized depolarization that activates the
voltage-gated channels and starts the action potential. Chloride channels open in response to
inhibitory neurotransmitters, such as γ-aminobutyric acid (GABA) and glycine, leading to
hyperpolarization of the membrane, which inhibits the subsequent generation of an action potential.
Neurotransmitter-gated ion channels have four transmembrane domains and probably function as
pentamers (Jentsch, supra). Amino acids in the second transmembrane domain appear to be important
in determining channel permeation and selectivity (Sather, W.A. et al. (1994) Curr. Opin. Neurobiol.
4:313-323).

Ligand-gated channels can be regulated by intracellular second messengers. For example,
calcium-activated K+ channels are gated by internal calcium ions. In nerve cells, an influx of calcium
during depolarization opens K+ channels to modulate the magnitude of the action potential (Ishi et al.,
supra). The large conductance (BK) channel has been purified from brain and its subunit
composition determined. The α subunit of the BK channel has seven rather than six transmembrane
domains in contrast to voltage-gated K+ channels. The extra transmembrane domain is located at the
subunit N-terminus. A 28-amino-acid stretch in the C-terminal region of the subunit (the "calcium
bowl" region) contains many negatively charged residues and is thought to be the region responsible
for calcium binding. The β subunit consists of two transmembrane domains connected by a
glycosylated extracellular loop, with intracellular N- and C-termini (Kaczorowski, supra; Vergara, C.
et al. (1998) Curr. Opin. Neurobiol. 8:321-329).

Cyclic nucleotide-gated (CNG) channels are gated by cytosolic cyclic nucleotides. The best
examples of these are the cAMP-gated Na+ channels involved in olfaction and the cGMP-gated cation
channels involved in vision. Both systems involve ligand-mediated activation of a G-protein coupled
receptor which then alters the level of cyclic nucleotide within the cell. CNG channels also represent
a major pathway for Ca2+ entry into neurons, and play roles in neuronal development and plasticity.
CNG channels are tetramers containing at least two types of subunits, an α subunit which can form
functional homomeric channels, and a β subunit, which modulates the channel properties. All CNG
subunits have six transmembrane domains and a pore forming region between the fifth and sixth
transmembrane domains, similar to voltage-gated K+ channels. A large C-terminal domain contains a
cyclic nucleotide binding domain, while the N-terminal domain confers variation among channel
subtypes (Zufall, F. et al. (1997) Curr. Opin. Neurobiol. 7:404-412).

The activity of other types of ion channel proteins may also be modulated by a variety of
intracellular signalling proteins. Many channels have sites for phosphorylation by one or more
protein kinases including protein kinase A, protein kinase C, tyrosine kinase, and casein kinase II, all
of which regulate ion channel activity in cells. Kir channels are activated by the binding of the Gβγ
subunits of heterotrimeric G-proteins (Reimann, F. and F.M. Ashcroft (1999) Curr. Opin. Cell. Biol.
11:503-508). Other proteins are involved in the localization of ion channels to specific sites in the
cell membrane. Such proteins include the PDZ domain proteins known as MAGUKs (membrane-
associated guanylate kinases) which regulate the clustering of ion channels at neuronal synapses
(Craven, S.E. and D.S. Bredt (1998) Cell 93:495-498).
Disease Correlation

The etiology of numerous human diseases and disorders can be attributed to defects in the
transport of molecules across membranes. Defects in the trafficking of membrane-bound transporters
and ion channels are associated with several disorders, e.g., cystic fibrosis, glucose-galactose
malabsorption syndrome, hypercholesterolemia, von Gierke disease, and certain forms of diabetes
mellitus. Single-gene defect diseases resulting in an inability to transport small molecules across
membranes include, e.g., cystinuria, iminoglycinuria, Hartnup disease, and Fanconi disease (van't Hoff,
W.G. (1996) Exp. Nephrol. 4:253-262; Talente, G.M. et al. (1994) Ann. Intern. Med. 120:218-226;
and Chillon, M. et al. (1995) New Engl. J. Med. 332:1475-1480).

Human diseases caused by mutations in ion channel genes include disorders of skeletal
muscle, cardiac muscle, and the central nervous system. Mutations in the pore-forming subunits of
sodium and chloride channels cause myotonia, a muscle disorder in which relaxation after voluntary
contraction is delayed. Sodium channel myotonias have been treated with channel blockers.
Mutations in muscle sodium and calcium channels cause forms of periodic paralysis, while mutations
in the sarcoplasmic calcium release channel, T-tubule calcium channel, and muscle sodium channel
cause malignant hyperthermia. Cardiac arrhythmia disorders such as the long QT syndromes and
idiopathic ventricular fibrillation are caused by mutations in potassium and sodium channels (Cooper,
E.C. and L.Y. Jan (1998) Proc. Natl. Acad. Sci. USA 96:4759-4766). All four known human
idiopathic epilepsy genes code for ion channel proteins (Berkovic, S.F. and I.E. Scheffer (1999) Curr.
Opin. Neurology 12:177-182). Other neurological disorders such as ataxias, hemiplegic migraine and
hereditary deafness can also result from mutations in ion channel genes (Jen, J. (1999) Curr. Opin.
Neurobiol. 9:274-280; Cooper, supra).

Ion channels have been the target for many drug therapies. Neurotransmitter-gated channels 
have been targeted in therapies for treatment of insomnia, anxiety, depression, and schizophrenia. 
Voltage-gated channels have been targeted in therapies for arrhythmia, ischemic stroke, head trauma, 
and neurodegenerative disease (Taylor, CP. and L.S. Narasimhan (1997) Adv. Pharmacol. 39:47-98). 

Various classes of ion channels also play an important role in the perception of pain, and thus are
potential targets for new analgesics. These include the vanilloid-gated ion channels, which are
activated by the vanilloid capsaicin, as well as by noxious heat. Local anesthetics such as lidocaine
and mexiletine which blockade voltage-gated Na+ channels have been useful in the treatment of
neuropathic pain (Eglen, supra).

Ion channels in the immune system have recently been suggested as targets for
immunomodulation. T-cell activation depends upon calcium signaling, and a diverse set of T-cell
specific ion channels has been characterized that affect this signaling process. Channel blocking
agents can inhibit secretion of lymphokines, cell proliferation, and killing of target cells. A peptide
antagonist of the T-cell potassium channel Kv1.3 was found to suppress delayed-type hypersensitivity
and allogeneic responses in pigs, validating the idea of channel blockers as safe and efficacious
immunosuppressants (Cahalan, M.D. and K.G. Chandy (1997) Curr. Opin. Biotechnol. 8:749-756).

The discovery of new transporters and ion channels, and the polynucleotides encoding them, 
satisfies a need in the art by providing new compositions which are useful in the diagnosis, 
prevention, and treatment of transport, neurological, muscle, immunological and cell proliferative 
disorders, and in the assessment of the effects of exogenous compounds on the expression of nucleic
acid and amino acid sequences of transporters and ion channels. 

SUMMARY OF THE INVENTION 

The invention features purified polypeptides, transporters and ion channels, referred to
collectively as "TRICH" and individually as "TRICH-1," "TRICH-2," "TRICH-3," "TRICH-4,"
"TRICH-5," "TRICH-6," "TRICH-7," "TRICH-8," "TRICH-9," "TRICH-10," "TRICH-11,"
"TRICH-12," "TRICH-13," "TRICH-14," "TRICH-15," "TRICH-16," "TRICH-17," "TRICH-18,"
"TRICH-19," and "TRICH-20." In one aspect, the invention provides an isolated polypeptide
selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected
from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring
amino acid sequence at least 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-20. In one alternative, the invention provides an isolated polypeptide comprising the amino
acid sequence of SEQ ID NO:1-20.
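
By way of illustration only, the "at least 90% identical" criterion recited above can be sketched in a few lines of Python. The sketch assumes that the two sequences have already been aligned to equal length with "-" marking gaps, and that identity is counted over all aligned columns; the function names and the choice of denominator are hypothetical and are not taken from this specification, whose own analysis tools and thresholds are those listed in Table 7.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; '-' marks a gap (assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column matches only if both residues are identical and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True if the aligned pair reaches the stated percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Counting gapped columns in the denominator is one of several conventions in use; a different convention would give a different percentage for the same pair of sequences.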

The invention further provides an isolated polynucleotide encoding a polypeptide selected
from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid
sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ
ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20.
In one alternative, the polynucleotide encodes a polypeptide selected from the group consisting of
SEQ ID NO:1-20. In another alternative, the polynucleotide is selected from the group consisting of
SEQ ID NO:21-40.

Additionally, the invention provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-20. In one alternative, the
invention provides a cell transformed with the recombinant polynucleotide. In another alternative, the
invention provides a transgenic organism comprising the recombinant polynucleotide.

The invention also provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-20. The method comprises a)
culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is
transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide encoding the polypeptide, and b) recovering the polypeptide so expressed.

Additionally, the invention provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-20.

The invention further provides an isolated polynucleotide selected from the group consisting
of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of
SEQ ID NO:21-40, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at
least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:21-40, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). In one alternative, the
polynucleotide comprises at least 60 contiguous nucleotides.

Additionally, the invention provides a method for detecting a target polynucleotide in a
sample, said target polynucleotide having a sequence of a polynucleotide selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:21-40, b) a polynucleotide comprising a naturally occurring polynucleotide
sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of
SEQ ID NO:21-40, c) a polynucleotide complementary to the polynucleotide of a), d) a
polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The
method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous
nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and
which probe specifically hybridizes to said target polynucleotide, under conditions whereby a
hybridization complex is formed between said probe and said target polynucleotide or fragments
thereof, and b) detecting the presence or absence of said hybridization complex, and optionally, if
present, the amount thereof. In one alternative, the probe comprises at least 60 contiguous
nucleotides.
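
The probe requirement recited above (at least 20 contiguous nucleotides complementary to the target) can be illustrated with the following Python sketch. It is a simplification under stated assumptions: complementarity is modeled as exact Watson-Crick pairing of DNA strings, whereas actual hybridization depends on the stringency conditions used; the function names are hypothetical and do not appear in this specification.

```python
# Translation table for Watson-Crick complements of DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_can_hybridize(probe: str, target: str, min_length: int = 20) -> bool:
    """True if the probe has at least min_length contiguous nucleotides
    that are perfectly complementary to some stretch of the target."""
    probe, target = probe.upper(), target.upper()
    # The probe pairs with the target wherever a window of the probe's
    # reverse complement occurs in the target strand.
    rc = reverse_complement(probe)
    for start in range(len(rc) - min_length + 1):
        if rc[start:start + min_length] in target:
            return True
    return False
```

A probe shorter than `min_length` is rejected without any search, which mirrors the lower bound stated in the method; setting `min_length=60` corresponds to the alternative embodiment.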

The invention further provides a method for detecting a target polynucleotide in a sample,
said target polynucleotide having a sequence of a polynucleotide selected from the group consisting
of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of
SEQ ID NO:21-40, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at
least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:21-40, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and b) detecting the presence or absence of said amplified target
polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.

The invention further provides a composition comprising an effective amount of a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-20, and a pharmaceutically acceptable excipient. In one embodiment, the
composition comprises an amino acid sequence selected from the group consisting of SEQ ID
NO:1-20. The invention additionally provides a method of treating a disease or condition associated with
decreased expression of functional TRICH, comprising administering to a patient in need of such
treatment the composition.

The invention also provides a method for screening a compound for effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-20. The method comprises a) exposing a sample comprising the
polypeptide to a compound, and b) detecting agonist activity in the sample. In one alternative, the
invention provides a composition comprising an agonist compound identified by the method and a
pharmaceutically acceptable excipient. In another alternative, the invention provides a method of
treating a disease or condition associated with decreased expression of functional TRICH, comprising
administering to a patient in need of such treatment the composition.

Additionally, the invention provides a method for screening a compound for effectiveness as
an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid
sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-20. The method comprises a) exposing a sample comprising the
polypeptide to a compound, and b) detecting antagonist activity in the sample. In one alternative, the
invention provides a composition comprising an antagonist compound identified by the method and a
pharmaceutically acceptable excipient. In another alternative, the invention provides a method of
treating a disease or condition associated with overexpression of functional TRICH, comprising
administering to a patient in need of such treatment the composition.

The invention further provides a method of screening for a compound that specifically binds
to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-20. The method comprises a) combining the polypeptide with at least
one test compound under suitable conditions, and b) detecting binding of the polypeptide to the test
compound, thereby identifying a compound that specifically binds to the polypeptide.

The invention further provides a method of screening for a compound that modulates the
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-20. The method comprises a) combining the polypeptide with at least
one test compound under conditions permissive for the activity of the polypeptide, b) assessing the
activity of the polypeptide in the presence of the test compound, and c) comparing the activity of the
polypeptide in the presence of the test compound with the activity of the polypeptide in the absence
of the test compound, wherein a change in the activity of the polypeptide in the presence of the test
compound is indicative of a compound that modulates the activity of the polypeptide.
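
The comparison in step c) above can be illustrated by the following Python sketch, which is offered as an example only. It assumes that activity has been measured as replicate numeric readings with and without the test compound, and that a relative change beyond an arbitrary tolerance counts as a "change"; the function names and the default tolerance of 0.2 are hypothetical and are not specified in this document.

```python
from statistics import mean


def fold_change(with_compound, without_compound):
    """Ratio of mean activity with the compound to mean activity without it."""
    baseline = mean(without_compound)
    if baseline == 0:
        raise ValueError("baseline activity is zero; ratio is undefined")
    return mean(with_compound) / baseline


def classify_modulation(with_compound, without_compound, tolerance=0.2):
    """Label the compound by the direction of the activity change.

    tolerance is an assumed cutoff: ratios within 1 +/- tolerance
    are treated as no detectable change.
    """
    ratio = fold_change(with_compound, without_compound)
    if ratio >= 1 + tolerance:
        return "increases activity"
    if ratio <= 1 - tolerance:
        return "decreases activity"
    return "no change detected"
```

In practice a statistical test on the replicates would normally replace the fixed tolerance; the fixed cutoff is used here only to keep the sketch short.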

The invention further provides a method for screening a compound for effectiveness in 
altering expression of a target polynucleotide, wherein said target polynucleotide comprises a 
polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, the method 
comprising a) exposing a sample comprising the target polynucleotide to a compound, and b)
detecting altered expression of the target polynucleotide. 

The invention further provides a method for assessing toxicity of a test compound, said 
method comprising a) treating a biological sample containing nucleic acids with the test compound; 
b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide 
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, ii) a 
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a 
polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, iii) a 
polynucleotide having a sequence complementary to i), iv) a polynucleotide complementary to the 
polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs under conditions 
whereby a specific hybridization complex is formed between said probe and a target polynucleotide 
in the biological sample, said target polynucleotide selected from the group consisting of i) a 
polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID 
NO:21-40, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, 
iii) a polynucleotide complementary to the polynucleotide of i), iv) a polynucleotide complementary 
to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Alternatively, the target
polynucleotide comprises a fragment of a polynucleotide sequence selected from the group consisting 
of i)-v) above; c) quantifying the amount of hybridization complex; and d) comparing the amount of 
hybridization complex in the treated biological sample with the amount of hybridization complex in
an untreated biological sample, wherein a difference in the amount of hybridization complex in the
treated biological sample is indicative of toxicity of the test compound. 
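
Steps c) and d) of the toxicity assessment above amount to comparing quantified hybridization signals between treated and untreated samples. The following Python sketch illustrates one way such a comparison might be made. It assumes that signals are available as positive numbers keyed by probe label, and that a two-fold difference in either direction is taken as a "difference"; the probe labels, the function name, and the two-fold cutoff are all hypothetical.

```python
def differential_probes(treated: dict, untreated: dict,
                        min_fold: float = 2.0) -> dict:
    """Return {probe: treated/untreated ratio} for probes whose signal
    differs by at least min_fold in either direction."""
    flagged = {}
    # Only probes measured in both samples can be compared.
    for probe in treated.keys() & untreated.keys():
        t, u = treated[probe], untreated[probe]
        if t <= 0 or u <= 0:
            continue  # skip signals at or below background
        ratio = t / u
        if ratio >= min_fold or ratio <= 1 / min_fold:
            flagged[probe] = ratio
    return flagged
```

For example, with hypothetical signals `{"probe_A": 400.0, "probe_B": 100.0}` for the treated sample and `{"probe_A": 100.0, "probe_B": 110.0}` for the untreated sample, only `probe_A` would be flagged.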

BRIEF DESCRIPTION OF THE TABLES 
Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the present invention.

Table 2 shows the GenBank identification number and annotation of the nearest GenBank
homolog for polypeptides of the invention. The probability scores for the matches between each
polypeptide and its homolog(s) are also shown.

Table 3 shows structural features of polypeptide sequences of the invention, including
predicted motifs and domains, along with the methods, algorithms, and searchable databases used for
analysis of the polypeptides.

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble
polynucleotide sequences of the invention, along with selected fragments of the polynucleotide
sequences.

Table 5 shows the representative cDNA library for polynucleotides of the invention.

Table 6 provides an appendix which describes the tissues and vectors used for construction of
the cDNA libraries shown in Table 5.

Table 7 shows the tools, programs, and algorithms used to analyze the polynucleotides and
polypeptides of the invention, along with applicable descriptions, references, and threshold
parameters.

DESCRIPTION OF THE INVENTION 

Before the present proteins, nucleotide sequences, and methods are described, it is understood 
that this invention is not limited to the particular machines, materials and methods described, as these
may vary. It is also to be understood that the terminology used herein is for the purpose of describing
particular embodiments only, and is not intended to limit the scope of the present invention which 
will be limited only by the appended claims. 

It must be noted that as used herein and in the appended claims, the singular forms "a," "an," 
and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, a
reference to "a host cell" includes a plurality of such host cells, and a reference to "an antibody" is a 
reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so 
forth. 

Unless defined otherwise, all technical and scientific terms used herein have the same 
meanings as commonly understood by one of ordinary skill in the art to which this invention belongs.
Although any machines, materials, and methods similar or equivalent to those described herein can be
used to practice or test the present invention, the preferred machines, materials and methods are now 
described. All publications mentioned herein are cited for the purpose of describing and disclosing 
the cell lines, protocols, reagents and vectors which are reported in the publications and which might 
be used in connection with the invention. Nothing herein is to be construed as an admission that the 
invention is not entitled to antedate such disclosure by virtue of prior invention. 
DEFINITIONS 

"TRICH" refers to the amino acid sequences of substantially purified TRICH obtained from 
any species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and 
human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of 
TRICH. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other 
compound or composition which modulates the activity of TRICH either by directly interacting with 
TRICH or by acting on components of the biological pathway in which TRICH participates.

An "allelic variant" is an alternative form of the gene encoding TRICH. Allelic variants may 
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in 
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or 
many allelic variants of its naturally occurring form. Common mutational changes which give rise to 
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides. 
Each of these types of changes may occur alone, or in combination with the others, one or more times 
in a given sequence. 

"Altered" nucleic acid sequences encoding TRICH include those sequences with deletions, 
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as TRICH or a 
polypeptide with at least one functional characteristic of TRICH. Included within this definition are 
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe 
of the polynucleotide encoding TRICH, and improper or unexpected hybridization to allelic variants, 
with a locus other than the normal chromosomal locus for the polynucleotide sequence encoding 
TRICH. The encoded protein may also be "altered," and may contain deletions, insertions, or 
substitutions of amino acid residues which produce a silent change and result in a functionally 
equivalent TRICH. Deliberate amino acid substitutions may be made on the basis of similarity in
polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the 
residues, as long as the biological or immunological activity of TRICH is retained. For example, 
negatively charged amino acids may include aspartic acid and glutamic acid, and positively charged 
amino acids may include lysine and arginine. Amino acids with uncharged polar side chains having 
similar hydrophilicity values may include: asparagine and glutamine; and serine and threonine.
Amino acids with uncharged side chains having similar hydrophilicity values may include: leucine,
isoleucine, and valine; glycine and alanine; and phenylalanine and tyrosine. 

The terms "amino acid" and "amino acid sequence" refer to an oligopeptide, peptide, 
polypeptide, or protein sequence, or a fragment of any of these, and to naturally occurring or synthetic 
molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally occurring 
protein molecule, "amino acid sequence" and like terms are not meant to limit the amino acid 
sequence to the complete native amino acid sequence associated with the recited protein molecule. 

"Amplification" relates to the production of additional copies of a nucleic acid sequence. 
Amplification is generally carried out using polymerase chain reaction (PCR) technologies well
known in the art. 

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity 
of TRICH. Antagonists may include proteins such as antibodies, nucleic acids, carbohydrates, small 
molecules, or any other compound or composition which modulates the activity of TRICH either by 
directly interacting with TRICH or by acting on components of the biological pathway in which 
TRICH participates. 

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments 
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant.
Antibodies that bind TRICH polypeptides can be prepared using intact polypeptides or using
fragments containing small peptides of interest as the immunizing antigen. The polypeptide or
oligopeptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the
translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired.
Commonly used carriers that are chemically coupled to peptides include bovine serum albumin,
thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize
the animal.

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that 
makes contact with a particular antibody. When a protein or a fragment of a protein is used to 
immunize a host animal, numerous regions of the protein may induce the production of antibodies 
which bind specifically to antigenic determinants (particular regions or three-dimensional structures 
on the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen
used to elicit the immune response) for binding to an antibody. 

The term "aptamer" refers to a nucleic acid or oligonucleotide molecule that binds to a 
specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX 
(Systematic Evolution of Ligands by Exponential Enrichment), described in U.S. Patent No.
5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries.
Aptamer compositions may be double-stranded or single-stranded, and may include
deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules.
The nucleotide components of an aptamer may have modified sugar groups (e.g., the 2'-OH group of a
ribonucleotide may be replaced by 2'-F or 2'-NH2), which may improve a desired property, e.g.,
resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules,
e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system.
Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a
cross-linker. (See, e.g., Brody, E.N. and L. Gold (2000) J. Biotechnol. 74:5-13.)

The term "intramer" refers to an aptamer which is expressed in vivo. For example, a vaccinia
virus-based RNA expression system has been used to express specific RNA aptamers at high levels in
the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA 96:3606-3610).

The term "spiegelmer" refers to an aptamer which includes L-DNA, L-RNA, or other left-handed
nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed
nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on
substrates containing right-handed nucleotides.

The term "antisense" refers to any composition capable of base-pairing with the "sense"
(coding) strand of a specific nucleic acid sequence. Antisense compositions may include DNA;
RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as
phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified
sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having
modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine. Antisense
molecules may be produced by any method including chemical synthesis or transcription. Once
introduced into a cell, the complementary antisense molecule base-pairs with a naturally occurring
nucleic acid sequence produced by the cell to form duplexes which block either transcription or
translation. The designation "negative" or "minus" can refer to the antisense strand, and the
designation "positive" or "plus" can refer to the sense strand of a reference DNA molecule.

The term "biologically active" refers to a protein having structural, regulatory, or biochemical 
functions of a naturally occurring molecule. Likewise, "immunologically active" or "immunogenic"
refers to the capability of the natural, recombinant, or synthetic TRICH, or of any oligopeptide
thereof, to induce a specific immune response in appropriate animals or cells and to bind with specific
antibodies.

"Complementary" describes the relationship between two single-stranded nucleic acid 
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement,
3'-TCA-5'.
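
By way of illustration only, the pairing rule in the example above may be sketched in Python. The function names `complement` and `reverse_complement` are hypothetical and form no part of the specification; the sketch assumes unambiguous DNA bases (A, C, G, T) and standard Watson-Crick pairing.

```python
# Watson-Crick pairing for DNA bases (illustrative helper, not from the specification).
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq_5_to_3: str) -> str:
    """Return the pairing strand, base by base, written in the 3'->5' direction."""
    return "".join(_PAIR[base] for base in seq_5_to_3.upper())


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the same pairing strand rewritten in the conventional 5'->3' direction."""
    return complement(seq_5_to_3)[::-1]


# 5'-AGT-3' and the strand that anneals to it, in both orientations.
print(complement("AGT"))
print(reverse_complement("AGT"))
```

The first function reproduces the 3'-to-5' notation used in the example; the second gives the same strand in the 5'-to-3' orientation in which sequences are more commonly written.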

A "composition comprising a given polynucleotide sequence" and a "composition comprising 
a given amino acid sequence" refer broadly to any composition containing the given polynucleotide
or amino acid sequence. The composition may comprise a dry formulation or an aqueous solution.
Compositions comprising polynucleotide sequences encoding TRICH or fragments of TRICH may be
employed as hybridization probes. The probes may be stored in freeze-dried form and may be
associated with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be
deployed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., sodium dodecyl
sulfate; SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to repeated 
DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied 
Biosystems, Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been
assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer
program for fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison 
WI) or Phrap (University of Washington, Seattle WA). Some sequences have been both extended and 
assembled to produce the consensus sequence.

"Conservative amino acid substitutions" are those substitutions that are predicted to least
interfere with the properties of the original protein, i.e., the structure and especially the function of
the protein is conserved and not significantly changed by such substitutions. The table below shows
amino acids which may be substituted for an original amino acid in a protein and which are regarded
as conservative amino acid substitutions.

Original Residue    Conservative Substitution
Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His
Asp                 Asn, Glu
Cys                 Ala, Ser
Gln                 Asn, Glu, His
Glu                 Asp, Gln, His
Gly                 Ala
His                 Asn, Arg, Gln, Glu
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 His, Met, Leu, Trp, Tyr
Ser                 Cys, Thr
Thr                 Ser, Val
Trp                 Phe, Tyr
Tyr                 His, Phe, Trp
Val                 Ile, Leu, Thr
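
By way of illustration only, the table above may be held as a lookup structure and queried as sketched below in Python. The names `CONSERVATIVE` and `is_conservative` are hypothetical; the entries are a transcription of the table using three-letter residue codes. Because the table is directional as written, the lookup is directional as well (for example, Ser is listed for Ala, but Ala is not listed for Ser).

```python
# Transcription of the conservative-substitution table (original residue -> allowed replacements).
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the table lists `replacement` as a conservative substitution for `original`."""
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Leu", "Ile"))
print(is_conservative("Gly", "Trp"))
```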



Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide 
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, 
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of
the side chain.

A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the 
absence of one or more amino acid residues or nucleotides. 

The term "derivative" refers to a chemically modified polynucleotide or polypeptide. 
Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an 
alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which 
retains at least one biological or immunological function of the natural molecule. A derivative 
polypeptide is one modified by glycosylation, pegylation, or any similar process that retains at least 
one biological or immunological function of the polypeptide from which it was derived. 

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a 
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide. 

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or 
absent gene or protein expression, determined by comparing at least two different samples. Such 
comparisons may be carried out between, for example, a treated and an untreated sample, or a 
diseased and a normal sample. 

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an 
exon may represent a structural or functional domain of the encoded protein, new proteins may be 
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the 
evolution of new protein functions. 

A "fragment" is a unique portion of TRICH or the polynucleotide encoding TRICH which is 
identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up 
to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a 
fragment may comprise from 5 to 1000 contiguous nucleotides or amino acid residues. A fragment 
used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 
15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino acid 
residues in length. Fragments may be preferentially selected from certain regions of a molecule. For 
example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected 
from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain 
defined sequence. Clearly these lengths are exemplary, and any length that is supported by the 
specification, including the Sequence Listing, tables, and figures, may be encompassed by the present 
embodiments. 
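
By way of illustration only, the length and identity conditions of the "fragment" definition may be sketched in Python as follows. The function names and the example sequence are hypothetical; the default minimum length of 5 is taken from the exemplary range above and is an assumption, not a limit. The sketch does not test whether a fragment is unique within a genome.

```python
def is_fragment(candidate: str, parent: str, min_len: int = 5) -> bool:
    """True if `candidate` is a contiguous stretch of `parent` and strictly shorter than it."""
    return min_len <= len(candidate) < len(parent) and candidate in parent


def fragments(parent: str, length: int) -> list:
    """List every contiguous window of `length` residues in `parent`, in order of position."""
    return [parent[i:i + length] for i in range(len(parent) - length + 1)]


parent = "ATGGCCATTGTAATGGGCCGC"  # hypothetical sequence, for illustration only
print(is_fragment("GCCATTG", parent))  # contiguous and shorter than the parent
print(is_fragment(parent, parent))     # full length, so not a fragment
print(fragments("ATGGC", 4))
```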

A fragment of SEQ ID NO:21-40 comprises a region of unique polynucleotide sequence that 
specifically identifies SEQ ID NO:21-40, for example, as distinct from any other sequence in the 
genome from which the fragment was obtained. A fragment of SEQ ID NO:21-40 is useful, for
example, in hybridization and amplification technologies and in analogous methods that distinguish
SEQ ID NO:21-40 from related polynucleotide sequences. The precise length of a fragment of SEQ
ID NO:21-40 and the region of SEQ ID NO:21-40 to which the fragment corresponds are routinely
determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

A fragment of SEQ ID NO:1-20 is encoded by a fragment of SEQ ID NO:21-40. A fragment 
of SEQ ID NO:1-20 comprises a region of unique amino acid sequence that specifically identifies
SEQ ID NO:1-20. For example, a fragment of SEQ ID NO:1-20 is useful as an immunogenic peptide
for the development of antibodies that specifically recognize SEQ ID NO:1-20. The precise length of
a fragment of SEQ ID NO:1-20 and the region of SEQ ID NO:1-20 to which the fragment
corresponds are routinely determinable by one of ordinary skill in the art based on the intended
purpose for the fragment.

A "full length" polynucleotide sequence is one containing at least a translation initiation 
codon (e.g., methionine) followed by an open reading frame and a translation termination codon. A 
"full length" polynucleotide sequence encodes a "full length" polypeptide sequence. 

"Homology" refers to sequence similarity or, interchangeably, sequence identity, between 
two or more polynucleotide sequences or two or more polypeptide sequences.

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer
to the percentage of residue matches between at least two polynucleotide sequences aligned using a
standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps
in the sequences being compared in order to optimize alignment between two sequences, and
therefore achieve a more meaningful comparison of the two sequences.
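
By way of illustration only, a percentage of residue matches may be computed from two sequences that have already been aligned, with "-" marking an inserted gap, as sketched below in Python. The function name `percent_identity` is hypothetical. The choice of denominator (every alignment column in which at least one sequence has a residue) is an assumption made for this sketch; alignment programs differ on this point, and the sketch does not reproduce the calculation of any particular program.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both rows carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both rows hold a gap; they carry no information.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    # A gap paired with a residue never counts as a match.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical aligned pair: six columns, one mismatch and one gap.
print(round(percent_identity("ACGT-A", "ACCTGA"), 1))
```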

Percent identity between polynucleotide sequences may be determined using the default 
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program. This program is part of the LASERGENE software package, a suite of
molecular biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is described in
Higgins, D.G. and P.M. Sharp (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992) CABIOS
8:189-191. For pairwise alignments of polynucleotide sequences, the default parameters are set as
follows: Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The "weighted" residue
weight table is selected as the default. Percent identity is reported by CLUSTAL V as the "percent
similarity" between aligned polynucleotide sequences.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms
is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment
Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which is available
from several sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence
analysis programs including "blastn," that is used to align a known polynucleotide sequence with
other polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST 2
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html.
The "BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST
programs are commonly used with gap and other parameters set to default settings. For example, to
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example:

Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 11
Filter: on
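
By way of illustration only, the manner in which the match reward, mismatch penalty, and gap penalties listed above combine into a raw score for one fixed alignment may be sketched in Python as follows. The function name `alignment_score` is hypothetical. The sketch assumes that a run of k consecutive gap positions costs the open penalty plus k times the extension penalty, which is a common affine convention and an assumption here. It scores a given alignment only and does not perform the BLAST search, apply the drop-off, word size, or filter settings, or compute an Expect value.

```python
def alignment_score(row_a: str, row_b: str, reward: int = 1, penalty: int = -2,
                    gap_open: int = 5, gap_extend: int = 2) -> int:
    """Raw score of two pre-aligned nucleotide rows, with '-' marking gap positions."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    score = 0
    gap_in = None  # row holding the current gap run: "a", "b", or None
    for x, y in zip(row_a.upper(), row_b.upper()):
        if x == "-" and y == "-":
            raise ValueError("a column may not contain two gaps")
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            if side != gap_in:
                score -= gap_open  # charged once when a new gap run begins
                gap_in = side
            score -= gap_extend    # charged for every gap position
        else:
            score += reward if x == y else penalty
            gap_in = None
    return score


# Hypothetical alignments, for illustration only.
print(alignment_score("ACGTACGT", "ACGTACGT"))
print(alignment_score("ACGT-CGT", "ACGTACGT"))
```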

Percent identity may be measured over the length of an entire defined sequence, for example, 
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, 
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at 
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length
supported by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to
describe a length over which percentage identity may be measured. 

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode 
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes
in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid
sequences that all encode substantially the same protein. 
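
By way of illustration only, the degeneracy described above may be demonstrated in Python with the standard genetic code. The names `CODON_TABLE` and `translate` and the two example sequences are hypothetical. The sketch assumes the standard code, a reading frame beginning at the first nucleotide, and DNA (T rather than U) input.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order ("*" marks a stop codon).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, codon by codon, from its first nucleotide."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical sequences that differ at half of their positions.
seq_1 = "ATGCTTAGCCGT"
seq_2 = "ATGTTGTCACGA"
print(translate(seq_1), translate(seq_2))
```

The two sequences differ at six of twelve positions yet specify the same amino acids, because leucine, serine, and arginine are each encoded by six codons.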

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to 
the percentage of residue matches between at least two polypeptide sequences aligned using a 
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some
alignment methods take into account conservative amino acid substitutions. Such conservative
substitutions, explained in more detail above, generally preserve the charge and hydrophobicity at the
site of substitution, thus preserving the structure (and therefore function) of the polypeptide. 

Percent identity between polypeptide sequences may be determined using the default 
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program (described and referenced above). For pairwise alignments of
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default 
residue weight table. As with polynucleotide alignments, the percent identity is reported by 
CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs. 

Alternatively, the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for
example: 

Matrix: BLOSUM62 

Open Gap: 11 and Extension Gap: 1 penalties 
Gap x drop-off: 50
Expect: 10 
Word Size: 3 
Filter: on 

Percent identity may be measured over the length of an entire defined polypeptide sequence, 
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for 
example, over the length of a fragment taken from a larger, defined polypeptide sequence, for 
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment 
length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be 
used to describe a length over which percentage identity may be measured. 

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain 
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for 
chromosome replication, segregation and maintenance. 

The term "humanized antibody" refers to an antibody molecule in which the amino acid 
sequence in the non-antigen binding regions has been altered so that the antibody more closely 
resembles a human antibody, and still retains its original binding ability. 

"Hybridization" refers to the process by which a polynucleotide strand anneals with a 
complementary strand through base pairing under defined hybridization conditions. Specific 
hybridization is an indication that two nucleic acid sequences share a high degree of complementarity. 
Specific hybridization complexes form under permissive annealing conditions and remain hybridized 
after the "washing" step(s). The washing step(s) is particularly important in determining the 
stringency of the hybridization process, with more stringent conditions allowing less non-specific 
binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive 
conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill
in the art and may be consistent among hybridization experiments, whereas wash conditions may be
varied among experiments to achieve the desired stringency, and therefore hybridization specificity.
Permissive annealing conditions occur, for example, at 68°C in the presence of about 6 x SSC, about
1% (w/v) SDS, and about 100 µg/ml sheared, denatured salmon sperm DNA.

Generally, stringency of hybridization is expressed, in part, with reference to the temperature
under which the wash step is carried out. Such wash temperatures are typically selected to be about
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of
the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and
conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. et al.
(1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press,
Plainview NY; specifically see volume 2, chapter 9.
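
By way of illustration only, the relationship between Tm and wash temperature may be sketched in Python as follows. The function names are hypothetical. The equation used is one commonly cited empirical approximation for long DNA:DNA hybrids (Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 675/N - 0.61(% formamide)); it is an assumption of this sketch and is not asserted to be the exact form given in the cited manual, and published variants use somewhat different coefficients. The sodium concentration in the usage example assumes that 1 x SSC supplies about 0.195 M sodium ion.

```python
import math


def melting_temperature(gc_percent: float, length: int, na_molar: float,
                        formamide_percent: float = 0.0) -> float:
    """Approximate Tm in degrees C for a long DNA:DNA hybrid (empirical formula, assumed here)."""
    return (81.5
            + 16.6 * math.log10(na_molar)   # ionic strength term
            + 0.41 * gc_percent             # base composition term
            - 675.0 / length                # hybrid length term
            - 0.61 * formamide_percent)     # formamide lowers Tm


def wash_window(tm: float) -> tuple:
    """Wash temperatures about 20 C to 5 C below Tm, per the range stated in the text."""
    return (tm - 20.0, tm - 5.0)


# Hypothetical probe: 500 nucleotides, 50% G+C, washed in 0.2 x SSC (assumed 0.039 M Na+).
tm = melting_temperature(gc_percent=50.0, length=500, na_molar=0.039)
low, high = wash_window(tm)
print(round(tm, 1), round(low, 1), round(high, 1))
```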

High stringency conditions for hybridization between polynucleotides of the present 
invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS,
for 1 hour. Alternatively, temperatures of about 65°C, 60°C, 55°C or 42°C may be used. SSC
concentration may be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%.
Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents
include, for instance, sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic
solvent, such as formamide at a concentration of about 35-50% v/v, may also be used under particular
circumstances, such as for RNA:DNA hybridizations. Useful variations on these wash conditions
will be readily apparent to those of ordinary skill in the art. Hybridization, particularly under high
stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such
similarity is strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

The term "hybridization complex" refers to a complex formed between two nucleic acid
sequences by virtue of the formation of hydrogen bonds between complementary bases. A
hybridization complex may be formed in solution (e.g., C0t or R0t analysis) or formed between one
nucleic acid sequence present in solution and another nucleic acid sequence immobilized on a solid
support (e.g., paper, membranes, filters, chips, pins or glass slides, or any other appropriate substrate
to which cells or their nucleic acids have been fixed).

The words "insertion" and "addition" refer to changes in an amino acid or nucleotide
sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively.

"Immune response" can refer to conditions associated with inflammation, trauma, immune
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression
of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect
cellular and systemic defense systems.

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of TRICH which is
capable of eliciting an immune response when introduced into a living organism, for example, a
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment
of TRICH which is useful in any of the antibody production methods disclosed herein or known in the
art.

The term "microarray" refers to an arrangement of a plurality of polynucleotides, 
polypeptides, or other chemical compounds on a substrate. 

The terms "element" and "array element" refer to a polynucleotide, polypeptide, or other
chemical compound having a unique and defined position on a microarray. 

The term "modulate" refers to a change in the activity of TRICH. For example, modulation
may cause an increase or a decrease in protein activity, binding characteristics, or any other 
biological, functional, or immunological properties of TRICH. 

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide,
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or 
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the 
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material. 

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a 
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably 
linked to a coding sequence if the promoter affects the transcription or expression of the coding 
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where 
necessary to join two protein coding regions, in the same reading frame. 

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which 
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of 
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. 
PNAs preferentially bind complementary single stranded DNA or RNA and stop transcript 
elongation, and may be pegylated to extend their lifespan in the cell.

"Post-translational modification" of a TRICH may involve lipidation, glycosylation,
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in
the art. These processes may occur synthetically or biochemically. Biochemical modifications will
vary by cell type depending on the enzymatic milieu of TRICH.

"Probe" refers to nucleic acid sequences encoding TRICH, their complements, or fragments 
thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are 
isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. 
Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes.
"Primers" are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target
polynucleotide by complementary base-pairing. The primer may then be extended along the target
DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and 
identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR). 

Probes and primers as used in the present invention typically comprise at least 15 contiguous 
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers 
may be considerably longer than these examples, and it is understood that any length supported by the 
specification, including the tables, figures, and Sequence Listing, may be used. 

Methods for preparing and using probes and primers are described in the references, for
example Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold
Spring Harbor Press, Plainview NY; Ausubel, F.M. et al. (1987) Current Protocols in Molecular
Biology, Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis, M. et al. (1990) PCR
Protocols, A Guide to Methods and Applications, Academic Press, San Diego CA. PCR primer pairs
can be derived from a known sequence, for example, by using computer programs intended for that
purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge 
MA). 

Oligonucleotides for use as primers are selected using software known in the art for such 
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to
5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer
selection programs have incorporated additional features for expanded capabilities. For example, the
PrimOU primer selection program (available to the public from the Genome Center at University of
Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from
megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3
primer selection program (available to the public from the Whitehead Institute/MIT Center for
Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which
sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the
selection of oligonucleotides for microarrays. (The source code for the latter two primer selection
programs may also be obtained from their respective sources and modified to meet the user's specific
needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping
Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments,
thereby allowing selection of primers that hybridize to either the most conserved or least conserved
regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both
35 unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and 
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polynucleotide fragments identified by any of the above selection methods are useful in hybridization 
technologies, for example, as PGR or sequencing primers, microarray elements, or specific probes to 
identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of 
oligonucleotide selection are not limited to those described above.

A "recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence
that is made by an artificial combination of two or more otherwise separated segments of sequence.
This artificial combination is often accomplished by chemical synthesis or, more commonly, by the
artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques
such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have
been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a
recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter
sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to
transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions
(UTRs). Regulatory elements interact with host or viral proteins which control transcription,
translation, or RNA stability.

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid, 
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent, 
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and 
other moieties known in the art.

An "RNA equivalent," in reference to a DNA sequence, is composed of the same linear
sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the
nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose
instead of deoxyribose.
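
At the level of sequence text, the definition above amounts to a one-for-one replacement of thymine by uracil. The following is a minimal sketch of that operation, assuming sequences are written as plain strings over the letters A, C, G, T and N; the function name `rna_equivalent` is hypothetical and is used for illustration only.

```python
def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence string.

    Every thymine (T/t) is replaced by uracil (U/u); all other bases are
    left unchanged. The ribose/deoxyribose difference is chemical and has
    no representation in the sequence text itself.
    """
    allowed = set("ACGTNacgtn")  # assumption: plain bases plus N for unknown
    unexpected = set(dna) - allowed
    if unexpected:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(unexpected)}")
    return dna.translate(str.maketrans("Tt", "Uu"))


if __name__ == "__main__":
    print(rna_equivalent("ATGCTTGA"))
```

The linear order and length of the sequence are preserved, which is the property the definition relies on.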

The term "sample" is used in its broadest sense. A sample suspected of containing TRICH, 
nucleic acids encoding TRICH, or fragments thereof may comprise a bodily fluid; an extract from a
cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or 
cDNA, in solution or bound to a substrate; a tissue; a tissue print; etc. 

The terms "specific binding" and "specifically binding" refer to that interaction between a 
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or 
synthetic binding composition. The interaction is dependent upon the presence of a particular
structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding
molecule. For example, if an antibody is specific for epitope "A," the presence of a polypeptide 
comprising the epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A 
and the antibody will reduce the amount of labeled A that binds to the antibody. 

The term "substantially purified" refers to nucleic acid or amino acid sequences that are 
removed from their natural environment and are isolated or separated, and are at least 60% free, 
preferably at least 75% free, and most preferably at least 90% free from other components with which 
they are naturally associated. 

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides 
by different amino acid residues or nucleotides, respectively. 

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters, 
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, 
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells, 
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound. 

A "transcript image" or "expression profile" refers to the collective pattern of gene 
expression by a particular cell type or tissue under given conditions at a given time. 

"Transformation" describes a process by which exogenous DNA is introduced into a recipient 
cell. Transformation may occur under natural or artificial conditions according to various methods 
well known in the art, and may rely on any known method for the insertion of foreign nucleic acid 
sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based 
on the type of host cell being transformed and may include, but is not limited to, bacteriophage or 
viral infection, electroporation, heat shock, lipofection, and particle bombardment. The term
"transformed cells" includes stably transformed cells in which the inserted DNA is capable of 
replication either as an autonomously replicating plasmid or as part of the host chromosome, as well 
as transiently transformed cells which express the inserted DNA or RNA for limited periods of time. 

A "transgenic organism," as used herein, is any organism, including but not limited to
animals and plants, in which one or more of the cells of the organism contains heterologous nucleic
acid introduced by way of human intervention, such as by transgenic techniques well known in the
art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor
of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with
a recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in
vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The
transgenic organisms contemplated in accordance with the present invention include bacteria,
cyanobacteria, fungi, plants and animals. The isolated DNA of the present invention can be
introduced into the host by methods known in the art, for example infection, transfection,
transformation or transconjugation. Techniques for transferring the DNA of the present invention
into such organisms are widely known and provided in references such as Sambrook et al. (1989),
supra.

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having
at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9
(May-07-1999) set at default parameters. Such a pair of nucleic acids may show, for example, at
least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least
92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least
99% or greater sequence identity over a certain defined length. A variant may be described as, for
example, an "allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice
variant may have significant identity to a reference molecule, but will generally have a greater or
lesser number of polynucleotides due to alternate splicing of exons during mRNA processing. The
corresponding polypeptide may possess additional functional domains or lack domains that are
present in the reference molecule. Species variants are polynucleotide sequences that vary from one
species to another. The resulting polypeptides will generally have significant amino acid identity
relative to each other. A polymorphic variant is a variation in the polynucleotide sequence of a
particular gene between individuals of a given species. Polymorphic variants also may encompass
"single nucleotide polymorphisms" (SNPs) in which the polynucleotide sequence varies by one
nucleotide base. The presence of SNPs may be indicative of, for example, a certain population, a
disease state, or a propensity for a disease state.

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having 
at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of 
the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9
(May-07-1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 
94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence 
identity over a certain defined length of one of the polypeptides. 
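
The percent-identity thresholds in the two "variant" definitions can be illustrated with a small calculation. The sketch below is not BLAST and does not reproduce the "BLAST 2 Sequences" tool; it assumes that two sequences have already been aligned by some other means (gaps written as "-") and simply counts identical columns over the aligned length. The function names and the choice of denominator (all aligned columns, including gapped ones) are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns where both sequences have a gap are
    ignored; a column counts as a match only when both residues are equal
    and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_variant_threshold(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True if the pair reaches the stated minimum identity (default 40%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("ACGT-A", "ACGTTA"))
    print(meets_variant_threshold("ACGT", "TTTT"))
```

Raising `threshold` to 50, 60, 70 and so on gives the progressively narrower identity levels listed in the definitions.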

THE INVENTION

The invention is based on the discovery of new human transporters and ion channels 
(TRICH), the polynucleotides encoding TRICH, and the use of these compositions for the diagnosis, 
treatment, or prevention of transport, neurological, muscle, immunological and cell proliferative 
disorders. 

WO 02/40541 PCT/US01/46055

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the invention. Each polynucleotide and its corresponding polypeptide are correlated to a
single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is denoted 
by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an Incyte 
polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide sequence is 
denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID NO:) and 
an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown. 

Table 2 shows sequences with homology to the polypeptides of the invention as identified by 
BLAST analysis against the GenBank protein (genpept) database. Columns 1 and 2 show the 
polypeptide sequence identification number (Polypeptide SEQ ID NO:) and the corresponding Incyte 
polypeptide sequence number (Incyte Polypeptide ID) for polypeptides of the invention. Column 3 
shows the GenBank identification number (GenBank ID NO:) of the nearest GenBank homolog. 
Column 4 shows the probability scores for the matches between each polypeptide and its homolog(s). 
Column 5 shows the annotation of the GenBank homologs along with relevant citations where 
applicable, all of which are expressly incorporated by reference herein. 

Table 3 shows various structural features of the polypeptides of the invention. Columns 1 
and 2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding 
Incyte polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. 
Column 3 shows the number of amino acid residues in each polypeptide. Column 4 shows potential 
phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the 
MOTIFS program of the GCG sequence analysis software package (Genetics Computer Group, 
Madison WI). Column 6 shows amino acid residues comprising signature sequences, domains, and 
motifs. Column 7 shows analytical methods for protein structure/function analysis and in some cases, 
searchable databases to which the analytical methods were applied. 

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these 
properties establish that the claimed polypeptides are transporters and ion channels. For example, 
SEQ ID NO:5 is 61% identical to Drosophila sodium-hydrogen exchanger NHE1 (GenBank ID 
g4894991) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The 
BLAST probability score is 6.0e-139, which indicates the probability of obtaining the observed 
polypeptide sequence alignment by chance. SEQ ID NO:5 also contains a sodium/hydrogen
exchanger family domain as determined by searching for statistically significant matches in the 
hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See 
Table 3.) Data from BLIMPS analysis provides further corroborative evidence that SEQ ID NO:5 is a 
sodium/hydrogen exchanger. In an alternative example, SEQ ID NO:6 is about 50% identical to 
human citrin, the adult-onset type II citrullinemia protein (GenBank ID g5052319), as determined by
the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is
6.0e-51, which indicates the probability of obtaining the observed polypeptide sequence alignment by
chance. SEQ ID NO:6 also contains mitochondrial carrier protein domains as determined by
searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM
database of conserved protein family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and
PROFILESCAN analyses provide further corroborative evidence that SEQ ID NO:6 is a
mitochondrial carrier protein. In an alternative example, SEQ ID NO:7 is 27% identical to
Synechocystis sp. melibiose carrier protein (GenBank ID g1653342) as determined by the Basic
Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 1.8e-16,
which indicates the probability of obtaining the observed polypeptide sequence alignment by chance.
Additional BLAST data from DOMO and PRODOM analyses provide further corroborative evidence
that SEQ ID NO:7 is a symporter protein. In an alternative example, SEQ ID NO:9 is 26% identical
to an Arabidopsis ABC transporter (GenBank ID g4262239) and is 99% identical, from residue M1 to
residue W374, to human sterolin-2 (GenBank ID g15146444) as determined by the Basic Local
Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability scores are 4.1e-25 and 0.0,
respectively, which indicate the probabilities of obtaining the observed polypeptide sequence
alignments by chance. SEQ ID NO:9 contains two transmembrane domains as determined by hidden
Markov model (HMM) analysis, as well as a white/scarlet ABC transporter domain. (See Table 3.) 
These data provide further corroborative evidence that SEQ ID NO:9 is an ABC transporter. In an 
alternative example, SEQ ID NO:12 is 93% identical to rat neuronal glutamine transporter (GenBank
ID g6978016) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.)
The BLAST probability score is 4.4e-239, which indicates the probability of obtaining the observed
polypeptide sequence alignment by chance. SEQ ID NO:12 also contains a transmembrane amino
acid transporter domain as determined by searching for statistically significant matches in the hidden
Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.)
These data provide corroborative evidence that SEQ ID NO:12 is an amino acid transporter protein.
In an alternative example, SEQ ID NO:14 is 52% identical to mouse multidrug resistance protein
(GenBank ID g387426) as determined by the Basic Local Alignment Search Tool (BLAST). (See
Table 2.) The BLAST probability score is 0.0, which indicates the probability of obtaining the
observed polypeptide sequence alignment by chance. SEQ ID NO:14 also contains an ABC
transporter domain and an ABC transporter transmembrane region domain as determined by
searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM
database of conserved protein family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and
PROFILESCAN analyses provide further corroborative evidence that SEQ ID NO:14 is a multidrug
resistance ABC transporter. In an alternative example, SEQ ID NO:18 is 41% identical to
Arabidopsis putative membrane transporter (GenBank ID g2289003) and is 99% identical, from
residue M20 to residue E648, to human proton myo-inositol transporter (GenBank ID g15211933) as
determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST
probability scores are 1.4e-94 and 0.0, respectively, which indicate the probabilities of obtaining the
observed polypeptide sequence alignments by chance. SEQ ID NO:18 also contains a sugar (and
other) transporter domain as determined by searching for statistically significant matches in the
hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See
Table 3.) Data from BLIMPS, MOTIFS, and PROFILESCAN analyses provide further corroborative
evidence that SEQ ID NO:18 is a sugar transporter. SEQ ID NO:1-4, SEQ ID NO:8, SEQ ID
NO:10-11, SEQ ID NO:13, SEQ ID NO:15-17, and SEQ ID NO:19-20 were analyzed and annotated in a
similar manner. The algorithms and parameters for the analysis of SEQ ID NO:1-20 are described in
Table 7. 

As shown in Table 4, the full length polynucleotide sequences of the present invention were 
assembled using cDNA sequences or coding (exon) sequences derived from genomic DNA, or any 
combination of these two types of sequences. Columns 1 and 2 list the polynucleotide sequence
identification number (Polynucleotide SEQ ID NO:) and the corresponding Incyte polynucleotide
consensus sequence number (Incyte Polynucleotide ID) for each polynucleotide of the invention.
Column 3 shows the length of each polynucleotide sequence in basepairs. Column 4 lists fragments
of the polynucleotide sequences which are useful, for example, in hybridization or amplification
technologies that identify SEQ ID NO:21-40 or that distinguish between SEQ ID NO:21-40 and
related polynucleotide sequences. Column 5 shows identification numbers corresponding to cDNA
sequences, coding sequences (exons) predicted from genomic DNA, and/or sequence assemblages
comprised of both cDNA and genomic DNA. These sequences were used to assemble the full length
polynucleotide sequences of the invention. Columns 6 and 7 of Table 4 show the nucleotide start (5')
and stop (3') positions of the cDNA and/or genomic sequences in column 5 relative to their respective
full length sequences.

The identification numbers in Column 5 of Table 4 may refer specifically, for example, to 
Incyte cDNAs along with their corresponding cDNA libraries. For example, 6122382H1 is the 
identification number of an Incyte cDNA sequence, and BRAHNON05 is the cDNA library from 
which it is derived. Incyte cDNAs for which cDNA libraries are not indicated were derived from
pooled cDNA libraries (e.g., 72008374V1). Alternatively, the identification numbers in column 5
may refer to GenBank cDNAs or ESTs (e.g., g2077361) which contributed to the assembly of the full
length polynucleotide sequences. In addition, the identification numbers in column 5 may identify
sequences derived from the ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those
sequences including the designation "ENST"). Alternatively, the identification numbers in column 5
may be derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences
including the designation "NM" or "NT") or the NCBI RefSeq Protein Sequence Records (i.e., those
sequences including the designation "NP"). Alternatively, the identification numbers in column 5
may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an "exon
stitching" algorithm. For example, FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a "stitched"
sequence in which XXXXXX is the identification number of the cluster of sequences to which the
algorithm was applied, and YYYYY is the number of the prediction generated by the algorithm, and
N1,2,3..., if present, represent specific exons that may have been manually edited during analysis (See
Example V). Alternatively, the identification numbers in column 5 may refer to assemblages of
exons brought together by an "exon-stretching" algorithm. For example,
FLXXXXXX_gAAAAA_gBBBBB_1_N is the identification number of a "stretched" sequence, with
XXXXXX being the Incyte project identification number, gAAAAA being the GenBank identification
number of the human genomic sequence to which the "exon-stretching" algorithm was applied,
gBBBBB being the GenBank identification number or NCBI RefSeq identification number of the
nearest GenBank protein homolog, and N referring to specific exons (See Example V). In instances
where a RefSeq sequence was used as a protein homolog for the "exon-stretching" algorithm, a
RefSeq identifier (denoted by "NM," "NP," or "NT") may be used in place of the GenBank identifier
(i.e., gBBBBB).

Alternatively, a prefix identifies component sequences that were hand-edited, predicted from 
genomic DNA sequences, or derived from a combination of sequence analysis methods. The 
following Table lists examples of component sequence prefixes and corresponding sequence analysis 
methods associated with the prefixes (see Example IV and Example V). 



Prefix            Type of analysis and/or examples of programs

GNN, GFG, ENST    Exon prediction from genomic sequences using, for example,
                  GENSCAN (Stanford University, CA, USA) or FGENES
                  (Computer Genomics Group, The Sanger Centre, Cambridge, UK).

GBI               Hand-edited analysis of genomic sequences.

FL                Stitched or stretched genomic sequences (see Example V).

INCY              Full length transcript and exon prediction from mapping of EST
                  sequences to the genome. Genomic location and EST composition
                  data are combined to predict the exons and resulting transcript.



In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in 
column 5 was obtained to confirm the final consensus polynucleotide sequence, but the relevant 
Incyte cDNA identification numbers are not shown. 
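
The identifier conventions described above for column 5 of Table 4 lend themselves to a simple lookup. The following is a rough sketch, not a parser used in preparing the tables: the function name `classify_component_id`, the regular expressions, and the category labels are all hypothetical, and the pattern for Incyte cDNA identifiers is inferred only from the two examples given in the text (6122382H1 and 72008374V1).

```python
import re

# Prefixes taken from the text and the prefix table above; the labels are
# paraphrases chosen for this illustration.
PREFIX_RULES = [
    ("ENST", "ENSEMBL-derived sequence (exon prediction from genomic sequence)"),
    ("INCY", "transcript and exon prediction from EST-to-genome mapping"),
    ("GNN", "exon prediction from genomic sequence"),
    ("GFG", "exon prediction from genomic sequence"),
    ("GBI", "hand-edited analysis of genomic sequence"),
    ("FL", "stitched or stretched genomic sequence"),
    ("NM", "NCBI RefSeq nucleotide record"),
    ("NT", "NCBI RefSeq nucleotide record"),
    ("NP", "NCBI RefSeq protein record"),
]


def classify_component_id(identifier: str) -> str:
    """Return a coarse category for a component sequence identifier."""
    ident = identifier.strip()
    # GenBank identifiers are written as a lowercase 'g' followed by digits.
    if re.fullmatch(r"g\d+", ident):
        return "GenBank cDNA or EST"
    for prefix, label in PREFIX_RULES:
        if ident.upper().startswith(prefix):
            return label
    # Assumed shape of Incyte cDNA identifiers: digits, letters, digits.
    if re.fullmatch(r"\d+[A-Z]+\d+", ident):
        return "Incyte cDNA"
    return "unrecognized"


if __name__ == "__main__":
    for example in ("6122382H1", "72008374V1", "g2077361"):
        print(example, "->", classify_component_id(example))
```

Because "ENST" appears in the text both as an ENSEMBL designation and in the prefix table, the sketch folds the two descriptions into a single label.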

Table 5 shows the representative cDNA libraries for those full length polynucleotide 
sequences which were assembled using Incyte cDNA sequences. The representative cDNA library is
the Incyte cDNA library which is most frequently represented by the Incyte cDNA sequences which
were used to assemble and confirm the above polynucleotide sequences. The tissues and vectors
which were used to construct the cDNA libraries shown in Table 5 are described in Table 6.

The invention also encompasses TRICH variants. A preferred TRICH variant is one which
has at least about 80%, or alternatively at least about 90%, or even at least about 95% amino acid
sequence identity to the TRICH amino acid sequence, and which contains at least one functional or
structural characteristic of TRICH.

The invention also encompasses polynucleotides which encode TRICH. In a particular
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected
from the group consisting of SEQ ID NO:21-40, which encodes TRICH. The polynucleotide
sequences of SEQ ID NO:21-40, as presented in the Sequence Listing, embrace the equivalent RNA 
sequences, wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the 
sugar backbone is composed of ribose instead of deoxyribose. 

The invention also encompasses a variant of a polynucleotide sequence encoding TRICH. In
particular, such a variant polynucleotide sequence will have at least about 70%, or alternatively at
least about 85%, or even at least about 95% polynucleotide sequence identity to the polynucleotide
sequence encoding TRICH. A particular aspect of the invention encompasses a variant of a
polynucleotide sequence comprising a sequence selected from the group consisting of SEQ ID
NO:21-40 which has at least about 70%, or alternatively at least about 85%, or even at least about
95% polynucleotide sequence identity to a nucleic acid sequence selected from the group consisting 
of SEQ ID NO:21-40. Any one of the polynucleotide variants described above can encode an amino 
acid sequence which contains at least one functional or structural characteristic of TRICH. 

In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant
of a polynucleotide sequence encoding TRICH. A splice variant may have portions which have
significant sequence identity to the polynucleotide sequence encoding TRICH, but will generally have
a greater or lesser number of polynucleotides due to additions or deletions of blocks of sequence
arising from alternate splicing of exons during mRNA processing. A splice variant may have less
than about 70%, or alternatively less than about 60%, or alternatively less than about 50%
polynucleotide sequence identity to the polynucleotide sequence encoding TRICH over its entire
length; however, portions of the splice variant will have at least about 70%, or alternatively at least
about 85%, or alternatively at least about 95%, or alternatively 100% polynucleotide sequence
identity to portions of the polynucleotide sequence encoding TRICH. Any one of the splice variants
described above can encode an amino acid sequence which contains at least one functional or
structural characteristic of TRICH.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of polynucleotide sequences encoding TRICH, some bearing minimal
similarity to the polynucleotide sequences of any known and naturally occurring gene, may be
produced. Thus, the invention contemplates each and every possible variation of polynucleotide
sequence that could be made by selecting combinations based on possible codon choices. These
combinations are made in accordance with the standard triplet genetic code as applied to the 
polynucleotide sequence of naturally occurring TRICH, and all such variations are to be considered 
as being specifically disclosed. 
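
The size of the "multitude" of coding sequences mentioned above can be illustrated numerically. The sketch below assumes the standard genetic code and counts how many distinct DNA coding sequences specify a given peptide, as the product of the number of synonymous codons for each residue. The function name `count_coding_sequences` and the short example peptides are hypothetical and are not sequences of the invention.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons available for each amino acid (and for stop).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    residues = peptide.upper()
    unknown = set(residues) - set(CODONS_PER_RESIDUE)
    if unknown:
        raise ValueError(f"unknown residue codes: {sorted(unknown)}")
    return prod(CODONS_PER_RESIDUE[aa] for aa in residues)


if __name__ == "__main__":
    for peptide in ("MW", "MKL", "MLSR"):
        print(peptide, count_coding_sequences(peptide))
```

Since most residues have two to six synonymous codons, the count grows roughly exponentially with peptide length, which is why the text treats all such variations as disclosed by rule rather than by enumeration.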

Although nucleotide sequences which encode TRICH and its variants are generally capable of
hybridizing to the nucleotide sequence of the naturally occurring TRICH under appropriately selected
conditions of stringency, it may be advantageous to produce nucleotide sequences encoding TRICH
or its derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally
occurring codons. Codons may be selected to increase the rate at which expression of the peptide
occurs in a particular prokaryotic or eukaryotic host in accordance with the frequency with which
particular codons are utilized by the host. Other reasons for substantially altering the nucleotide
sequence encoding TRICH and its derivatives without altering the encoded amino acid sequences 
include the production of RNA transcripts having more desirable properties, such as a greater 
half-life, than transcripts produced from the naturally occurring sequence. 

The invention also encompasses production of DNA sequences which encode TRICH and
TRICH derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the
synthetic sequence may be inserted into any of the many available expression vectors and cell
systems using reagents well known in the art. Moreover, synthetic chemistry may be used to
introduce mutations into a sequence encoding TRICH or any fragment thereof. 

Also encompassed by the invention are polynucleotide sequences that are capable of
hybridizing to the claimed polynucleotide sequences, and, in particular, to those shown in SEQ ID
NO:21-40 and fragments thereof under various conditions of stringency. (See, e.g., Wahl, G.M. and
S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol.
152:507-511.) Hybridization conditions, including annealing and wash conditions, are described in
"Definitions."

Methods for DNA sequencing are well known in the art and may be used to practice any of
the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment
of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied
Biosystems), thermostable T7 polymerase (Amersham Pharmacia Biotech, Piscataway NJ), or
combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE
amplification system (Life Technologies, Gaithersburg MD). Preferably, sequence preparation is
automated with machines such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV),
PTC200 thermal cycler (MJ Research, Watertown MA) and ABI CATALYST 800 thermal cycler
(Applied Biosystems). Sequencing is then carried out using either the ABI 373 or 377 DNA
sequencing system (Applied Biosystems), the MEGABACE 1000 DNA sequencing system
(Molecular Dynamics, Sunnyvale CA), or other systems known in the art. The resulting sequences
are analyzed using a variety of algorithms which are well known in the art. (See, e.g., Ausubel, F.M.
(1997) Short Protocols in Molecular Biology, John Wiley & Sons, New York NY, unit 7.7; Meyers,
R.A. (1995) Molecular Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853.)

The nucleic acid sequences encoding TRICH may be extended utilizing a partial nucleotide
sequence and employing various PCR-based methods known in the art to detect upstream sequences,
such as promoters and regulatory elements. For example, one method which may be employed,
restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic
DNA within a cloning vector. (See, e.g., Sarkar, G. (1993) PCR Methods Applic. 2:318-322.)
Another method, inverse PCR, uses primers that extend in divergent directions to amplify unknown
sequence from a circularized template. The template is derived from restriction fragments comprising
a known genomic locus and surrounding sequences. (See, e.g., Triglia, T. et al. (1988) Nucleic Acids
Res. 16:8186.) A third method, capture PCR, involves PCR amplification of DNA fragments
adjacent to known sequences in human and yeast artificial chromosome DNA. (See, e.g., Lagerstrom,
M. et al. (1991) PCR Methods Applic. 1:111-119.) In this method, multiple restriction enzyme
digestions and ligations may be used to insert an engineered double-stranded sequence into a region
of unknown sequence before performing PCR. Other methods which may be used to retrieve
unknown sequences are known in the art. (See, e.g., Parker, J.D. et al. (1991) Nucleic Acids Res.
19:3055-3060.) Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries
(Clontech, Palo Alto CA) to walk genomic DNA. This procedure avoids the need to screen libraries
and is useful in finding intron/exon junctions. For all PCR-based methods, primers may be designed
using commercially available software, such as OLIGO 4.06 primer analysis software (National
Biosciences, Plymouth MN) or another appropriate program, to be about 22 to 30 nucleotides in
length, to have a GC content of about 50% or more, and to anneal to the template at temperatures of
about 68°C to 72°C.
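
The length and GC-content guidelines stated above can be checked with a few lines of code. The sketch below is not a substitute for OLIGO 4.06 or any other primer design program: it treats "about 22 to 30 nucleotides" and "about 50% or more" as hard cutoffs, which is a simplifying assumption, and it does not estimate the annealing temperature. The function names and the example primers are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer written with A, C, G and T."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string over A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def primer_meets_guidelines(
    primer: str,
    min_len: int = 22,      # lower bound of the stated length range
    max_len: int = 30,      # upper bound of the stated length range
    min_gc: float = 0.50,   # stated minimum GC content
) -> bool:
    """True if the primer satisfies the length and GC-content guidelines."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


if __name__ == "__main__":
    candidates = {
        "gc_rich_22mer": "GC" * 6 + "AT" * 5,
        "at_only_24mer": "AT" * 12,
        "too_short": "GCGC",
    }
    for name, seq in candidates.items():
        print(name, len(seq), round(gc_fraction(seq), 2), primer_meets_guidelines(seq))
```

A fuller check would also estimate the annealing temperature against the 68°C to 72°C range, which depends on the melting-temperature model and reaction conditions chosen.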

When screening for full length cDNAs, it is preferable to use libraries that have been
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T)
library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence
into 5' non-transcribed regulatory regions.

35 Capillary electrophoresis systems which are commercially available may be used to analyze 
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the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary 
sequencing may employ flowable polymers for electrophoretic separation, four different nucleotide- 
specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the 
emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate 
software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and the entire
process from loading of samples to computer analysis and electronic data display may be computer 
controlled. Capillary electrophoresis is especially preferable for sequencing small DNA fragments 
which may be present in limited amounts in a particular sample. 

In another embodiment of the invention, polynucleotide sequences or fragments thereof
which encode TRICH may be cloned in recombinant DNA molecules that direct expression of
TRICH, or fragments or functional equivalents thereof, in appropriate host cells. Due to the inherent
degeneracy of the genetic code, other DNA sequences which encode substantially the same or a 
functionally equivalent amino acid sequence may be produced and used to express TRICH. 
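
The degeneracy described above can be illustrated with a short Python sketch. The codon table below is a small subset of the standard genetic code, and the peptide "MKLSW" and the function names are hypothetical; the sketch only shows that two different DNA sequences can encode the same amino acid sequence.

```python
# Subset of the standard genetic code: each amino acid maps to some of its
# synonymous codons (not the full set).
SYNONYMOUS_CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "L": ["CTG", "CTC"],
    "S": ["TCT", "AGC"],
    "W": ["TGG"],
}

# Reverse lookup from codon to amino acid.
CODON_TO_AA = {codon: aa
               for aa, codons in SYNONYMOUS_CODONS.items()
               for codon in codons}


def back_translate(peptide: str, choice: int = 0) -> str:
    """Build a DNA sequence for a peptide, picking the codon at index `choice`."""
    return "".join(
        SYNONYMOUS_CODONS[aa][choice % len(SYNONYMOUS_CODONS[aa])]
        for aa in peptide
    )


def translate(dna: str) -> str:
    """Translate a DNA sequence codon by codon using the subset table."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = "MKLSW"  # hypothetical peptide
dna_a = back_translate(peptide, choice=0)
dna_b = back_translate(peptide, choice=1)
print(dna_a, dna_b, translate(dna_a) == translate(dna_b))
```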

The nucleotide sequences of the present invention can be engineered using methods generally
known in the art in order to alter TRICH-encoding sequences for a variety of purposes including, but
not limited to, modification of the cloning, processing, and/or expression of the gene product. DNA
shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic
oligonucleotides may be used to engineer the nucleotide sequences. For example,
oligonucleotide-mediated site-directed mutagenesis may be used to introduce mutations that create
new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants,
and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such 
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent No. 
5,837,458; Chang, C-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or
improve the biological properties of TRICH, such as its biological or enzymatic activity or its ability
to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene 
variants is produced using PCR-mediated recombination of gene fragments. The library is then 
subjected to selection or screening procedures that identify those gene variants with the desired 
properties. These preferred variants may then be pooled and further subjected to recursive rounds of
DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial"
breeding and rapid molecular evolution. For example, fragments of a single gene containing random
point mutations may be recombined, screened, and then reshuffled until the desired properties are 
optimized. Alternatively, fragments of a given gene may be recombined with fragments of 
homologous genes in the same gene family, either from the same or different species, thereby
maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable
manner.

In another embodiment, sequences encoding TRICH may be synthesized, in whole or in part,
using chemical methods well known in the art. (See, e.g., Caruthers, M.H. et al. (1980) Nucleic Acids
Symp. Ser. 7:215-223; and Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232.)
Alternatively, TRICH itself or a fragment thereof may be synthesized using chemical methods. For
example, peptide synthesis can be performed using various solution-phase or solid-phase techniques.
(See, e.g., Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH Freeman, New
York NY, pp. 55-60; and Roberge, J.Y. et al. (1995) Science 269:202-204.) Automated synthesis
may be achieved using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the
amino acid sequence of TRICH, or any part thereof, may be altered during direct synthesis and/or
combined with sequences from other proteins, or any part thereof, to produce a variant polypeptide or 
a polypeptide having a sequence of a naturally occurring polypeptide. 

The peptide may be substantially purified by preparative high performance liquid 
chromatography. (See, e.g., Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421.)
The composition of the synthetic peptides may be confirmed by amino acid analysis or by
sequencing. (See, e.g., Creighton, supra, pp. 28-53.) 

In order to express a biologically active TRICH, the nucleotide sequences encoding TRICH 
or derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which 
contains the necessary elements for transcriptional and translational control of the inserted coding
sequence in a suitable host. These elements include regulatory sequences, such as enhancers,
constitutive and inducible promoters, and 5' and 3' untranslated regions in the vector and in 
polynucleotide sequences encoding TRICH. Such elements may vary in their strength and specificity. 
Specific initiation signals may also be used to achieve more efficient translation of sequences 
encoding TRICH. Such signals include the ATG initiation codon and adjacent sequences, e.g., the
Kozak sequence. In cases where sequences encoding TRICH and its initiation codon and upstream
regulatory sequences are inserted into the appropriate expression vector, no additional transcriptional 
or translational control signals may be needed. However, in cases where only coding sequence, or a 
fragment thereof, is inserted, exogenous translational control signals including an in-frame ATG 
initiation codon should be provided by the vector. Exogenous translational elements and initiation
codons may be of various origins, both natural and synthetic. The efficiency of expression may be
enhanced by the inclusion of enhancers appropriate for the particular host cell system used. (See,
e.g., Scharf, D. et al. (1994) Results Probl. Cell Differ. 20:125-162.)
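
The requirement for an in-frame ATG when only a coding fragment is inserted can be sketched in Python as follows. This is an illustrative check only; the function names and the example fragment are hypothetical, and a real vector would supply the codon together with its surrounding control sequences.

```python
START_CODON = "ATG"


def needs_vector_start_codon(insert: str) -> bool:
    """Return True if the insert does not itself begin with an ATG codon."""
    return not insert.upper().startswith(START_CODON)


def with_vector_start(insert: str) -> str:
    """Prepend an ATG when it is missing.

    Adding exactly three nucleotides keeps the insert in its original
    reading frame relative to the new initiation codon.
    """
    seq = insert.upper()
    return START_CODON + seq if needs_vector_start_codon(seq) else seq


fragment = "AAATGG"  # hypothetical coding fragment lacking its own start codon
print(needs_vector_start_codon(fragment), with_vector_start(fragment))
```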

Methods which are well known to those skilled in the art may be used to construct expression 
vectors containing sequences encoding TRICH and appropriate transcriptional and translational
control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques,
and in vivo genetic recombination. (See, e.g., Sambrook, J. et al. (1989) Molecular Cloning, A
Laboratory Manual, Cold Spring Harbor Press, Plainview NY, ch. 4, 8, and 16-17; Ausubel, F.M. et
al. (1995) Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, ch. 9, 13, and
16.)

A variety of expression vector/host systems may be utilized to contain and express sequences
encoding TRICH. These include, but are not limited to, microorganisms such as bacteria transformed
with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with 
yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); 
plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV,
or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or
animal cell systems. (See, e.g., Sambrook, supra; Ausubel, supra; Van Heeke, G. and S.M. Schuster
(1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA
91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO
J. 6:307-311; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New
York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and
Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses,
adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for
delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di
Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci.
USA 90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al.
(1994) Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.)
The invention is not limited by the host cell employed. 

In bacterial systems, a number of cloning and expression vectors may be selected depending 
upon the use intended for polynucleotide sequences encoding TRICH. For example, routine cloning,
subcloning, and propagation of polynucleotide sequences encoding TRICH can be achieved using a
multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1
plasmid (Life Technologies). Ligation of sequences encoding TRICH into the vector's multiple
cloning site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of
transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for
in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of
nested deletions in the cloned sequence. (See, e.g., Van Heeke, G. and S.M. Schuster (1989) J. Biol. 
Chem. 264:5503-5509.) When large quantities of TRICH are needed, e.g. for the production of 
antibodies, vectors which direct high level expression of TRICH may be used. For example, vectors 
containing the strong, inducible SP6 or T7 bacteriophage promoter may be used. 

Yeast expression systems may be used for production of TRICH. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable
integration of foreign sequences into the host genome for stable propagation. (See, e.g., Ausubel,
1995, supra; Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; and Scorer, C.A. et al. (1994)
Bio/Technology 12:181-184.)

Plant systems may also be used for expression of TRICH. Transcription of sequences 
encoding TRICH may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used 
alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J.
6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock
promoters may be used. (See, e.g., Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al.
(1984) Science 224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105.)
These constructs can be introduced into plant cells by direct DNA transformation or
pathogen-mediated transfection. (See, e.g., The McGraw Hill Yearbook of Science and Technology
(1992) McGraw Hill, New York NY, pp. 191-196.)

In mammalian cells, a number of viral-based expression systems may be utilized. In cases 
where an adenovirus is used as an expression vector, sequences encoding TRICH may be ligated into 
an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader 
sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain
infective virus which expresses TRICH in host cells. (See, e.g., Logan, J. and T. Shenk (1984) Proc.
Natl. Acad. Sci. USA 81:3655-3659.) In addition, transcription enhancers, such as the Rous sarcoma
virus (RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or
EBV-based vectors may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are
constructed and delivered via conventional delivery methods (liposomes, polycationic amino
polymers, or vesicles) for therapeutic purposes. (See, e.g., Harrington, J.J. et al. (1997) Nat. Genet.
15:345-355.)

For long term production of recombinant proteins in mammalian systems, stable expression 
of TRICH in cell lines is preferred. For example, sequences encoding TRICH can be transformed
into cell lines using expression vectors which may contain viral origins of replication and/or 
endogenous expression elements and a selectable marker gene on the same or on a separate vector. 
Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days in 
enriched media before being switched to selective media. The purpose of the selectable marker is to 
confer resistance to a selective agent, and its presence allows growth and recovery of cells which
successfully express the introduced sequences. Resistant clones of stably transformed cells may be
propagated using tissue culture techniques appropriate to the cell type. 

Any number of selection systems may be used to recover transformed cell lines. These 
include, but are not limited to, the herpes simplex virus thymidine kinase and adenine 
phosphoribosyltransferase genes, for use in tk⁻ and apr⁻ cells, respectively. (See, e.g., Wigler, M. et
al. (1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823.) Also, antimetabolite, antibiotic,
or herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to 
methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; als confers
resistance to chlorsulfuron; and pat, which encodes phosphinothricin acetyltransferase, confers
resistance to phosphinothricin. (See, e.g.,
Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981)
J. Mol. Biol. 150:1-14.) Additional selectable genes have been described, e.g., trpB and hisD, which 
alter cellular requirements for metabolites. (See, e.g., Hartman, S.C. and R.C. Mulligan (1988) Proc. 
Natl. Acad. Sci. USA 85:8047-8051.) Visible markers, e.g., anthocyanins, green fluorescent proteins
(GFP; Clontech), β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate
luciferin may be used. These markers can be used not only to identify transformants, but also to
quantify the amount of transient or stable protein expression attributable to a specific vector system. 
(See, e.g., Rhodes, C.A. (1995) Methods Mol. Biol. 55:121-131.) 

Although the presence/absence of marker gene expression suggests that the gene of interest is 
also present, the presence and expression of the gene may need to be confirmed. For example, if the
sequence encoding TRICH is inserted within a marker gene sequence, transformed cells containing
sequences encoding TRICH can be identified by the absence of marker gene function. Alternatively, 
a marker gene can be placed in tandem with a sequence encoding TRICH under the control of a single 
promoter. Expression of the marker gene in response to induction or selection usually indicates 
expression of the tandem gene as well. 

In general, host cells that contain the nucleic acid sequence encoding TRICH and that express
TRICH may be identified by a variety of procedures known to those of skill in the art. These
procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR 
amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or 
chip based technologies for the detection and/or quantification of nucleic acid or protein sequences. 

Immunological methods for detecting and measuring the expression of TRICH using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and 
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing 
monoclonal antibodies reactive to two non-interfering epitopes on TRICH is preferred, but a
competitive binding assay may be employed. These and other assays are well known in the art. (See,
e.g., Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul MN,
Sect. IV; Coligan, J.E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and
Wiley-Interscience, New York NY; and Pound, J.D. (1998) Immunochemical Protocols, Humana
Press, Totowa NJ.)

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled
hybridization or PCR probes for detecting sequences related to polynucleotides encoding TRICH 
include oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide. 
Alternatively, the sequences encoding TRICH, or any fragments thereof, may be cloned into a vector
for the production of an mRNA probe. Such vectors are known in the art, are commercially available,
and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase
such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety
of commercially available kits, such as those provided by Amersham Pharmacia Biotech, Promega 
(Madison WI), and US Biochemical. Suitable reporter molecules or labels which may be used for
ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic
agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like. 

Host cells transformed with nucleotide sequences encoding TRICH may be cultured under 
conditions suitable for the expression and recovery of the protein from cell culture. The protein 
produced by a transformed cell may be secreted or retained intracellularly depending on the sequence
and/or the vector used. As will be understood by those of skill in the art, expression vectors
containing polynucleotides which encode TRICH may be designed to contain signal sequences which
direct secretion of TRICH through a prokaryotic or eukaryotic cell membrane. 

In addition, a host cell strain may be chosen for its ability to modulate expression of the
inserted sequences or to process the expressed protein in the desired fashion. Such modifications of
the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation,
phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a "prepro" or
"pro" form of the protein may also be used to specify protein targeting, folding, and/or activity. 
Different host cells which have specific cellular machinery and characteristic mechanisms for 
post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the
American Type Culture Collection (ATCC, Manassas VA) and may be chosen to ensure the correct
modification and processing of the foreign protein. 

In another embodiment of the invention, natural, modified, or recombinant nucleic acid 
sequences encoding TRICH may be ligated to a heterologous sequence resulting in translation of a 
fusion protein in any of the aforementioned host systems. For example, a chimeric TRICH protein
containing a heterologous moiety that can be recognized by a commercially available antibody may
facilitate the screening of peptide libraries for inhibitors of TRICH activity. Heterologous protein and
peptide moieties may also facilitate purification of fusion proteins using commercially available 
affinity matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), 
maltose binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG,
c-myc, and hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their
cognate fusion proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and 
metal-chelate resins, respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity 
purification of fusion proteins using commercially available monoclonal and polyclonal antibodies 
that specifically recognize these epitope tags. A fusion protein may also be engineered to contain a 
proteolytic cleavage site located between the TRICH encoding sequence and the heterologous protein
sequence, so that TRICH may be cleaved away from the heterologous moiety following purification. 
Methods for fusion protein expression and purification are discussed in Ausubel (1995, supra, ch. 10). 
A variety of commercially available kits may also be used to facilitate expression and purification of 
fusion proteins. 

In a further embodiment of the invention, synthesis of radiolabeled TRICH may be achieved
in vitro using the TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These
systems couple transcription and translation of protein-coding sequences operably associated with the
T7, T3, or SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid
precursor, for example, ³⁵S-methionine.

TRICH of the present invention or fragments thereof may be used to screen for compounds
that specifically bind to TRICH. At least one and up to a plurality of test compounds may be
screened for specific binding to TRICH. Examples of test compounds include antibodies, 
oligonucleotides, proteins (e.g., receptors), or small molecules. 

In one embodiment, the compound thus identified is closely related to the natural ligand of
TRICH, e.g., a ligand or fragment thereof, a natural substrate, a structural or functional mimetic, or a
natural binding partner. (See, e.g., Coligan, J.E. et al. (1991) Current Protocols in Immunology 1(2):
Chapter 5.) Similarly, the compound can be closely related to the natural receptor to which TRICH
binds, or to at least a fragment of the receptor, e.g., the ligand binding site. In either case, the
compound can be rationally designed using known techniques. In one embodiment, screening for
these compounds involves producing appropriate cells which express TRICH, either as a secreted
protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or
E. coli. Cells expressing TRICH or cell membrane fractions which contain TRICH are then contacted
with a test compound and binding, stimulation, or inhibition of activity of either TRICH or the 
compound is analyzed. 

An assay may simply test binding of a test compound to the polypeptide, wherein binding is
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example,
the assay may comprise the steps of combining at least one test compound with TRICH, either in 
solution or affixed to a solid support, and detecting the binding of TRICH to the compound. 
Alternatively, the assay may detect or measure binding of a test compound in the presence of a 
labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical
libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to a 
solid support. 

TRICH of the present invention or fragments thereof may be used to screen for compounds 
that modulate the activity of TRICH. Such compounds may include agonists, antagonists, or partial
or inverse agonists. In one embodiment, an assay is performed under conditions permissive for
TRICH activity, wherein TRICH is combined with at least one test compound, and the activity of 
TRICH in the presence of a test compound is compared with the activity of TRICH in the absence of 
the test compound. A change in the activity of TRICH in the presence of the test compound is 
indicative of a compound that modulates the activity of TRICH. Alternatively, a test compound is
combined with an in vitro or cell-free system comprising TRICH under conditions suitable for
TRICH activity, and the assay is performed. In either of these assays, a test compound which
modulates the activity of TRICH may do so indirectly and need not come in direct contact with
TRICH. At least one and up to a plurality of test compounds may be screened.
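
The comparison of TRICH activity with and without a test compound can be sketched in Python as follows. This is an illustration only: the activity values are made-up numbers in arbitrary units, and the 20% tolerance used to call a change is a hypothetical threshold, not one taken from this specification.

```python
def fold_change(activity_with: float, activity_without: float) -> float:
    """Ratio of activity measured with the test compound to activity without it."""
    if activity_without <= 0:
        raise ValueError("control activity must be positive")
    return activity_with / activity_without


def classify_compound(activity_with: float,
                      activity_without: float,
                      tolerance: float = 0.2) -> str:
    """Label a compound by the direction of the activity change it produces.

    `tolerance` is a hypothetical cut-off: ratios within 1 +/- tolerance
    are treated as no detectable change.
    """
    ratio = fold_change(activity_with, activity_without)
    if ratio > 1 + tolerance:
        return "increases activity"
    if ratio < 1 - tolerance:
        return "decreases activity"
    return "no change detected"


# Hypothetical measurements in arbitrary units.
print(classify_compound(150.0, 100.0))
print(classify_compound(40.0, 100.0))
print(classify_compound(105.0, 100.0))
```

In practice the threshold would be set from the variability of replicate measurements rather than fixed in advance.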

In another embodiment, polynucleotides encoding TRICH or their mammalian homologs may
be "knocked out" in an animal model system using homologous recombination in embryonic stem
(ES) cells. Such techniques are well known in the art and are useful for the generation of animal 
models of human disease. (See, e.g., U.S. Patent No. 5,175,383 and U.S. Patent No. 5,767,337.) For 
example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse 
embryo and grown in culture. The ES cells are transformed with a vector containing the gene of
interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R.
(1989) Science 244:1288-1292). The vector integrates into the corresponding region of the host
genome by homologous recombination. Alternatively, homologous recombination takes place using
the Cre-loxP system to knock out a gene of interest in a tissue- or developmental stage-specific
manner (Marth, J.D. (1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids
Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse cell
blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred
to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce 
heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential 
therapeutic or toxic agents. 

Polynucleotides encoding TRICH may also be manipulated in vitro in ES cells derived from
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate 
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al. 
(1998) Science 282:1145-1147). 

Polynucleotides encoding TRICH can also be used to create "knockin" humanized animals
(pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a
region of a polynucleotide encoding TRICH is injected into animal ES cells, and the injected 
sequence integrates into the animal cell genome. Transformed cells are injected into blastulae, and 
the blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and
treated with potential pharmaceutical agents to obtain information on treatment of a human disease.
Alternatively, a mammal inbred to overexpress TRICH, e.g., by secreting TRICH in its milk, may also
serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).
THERAPEUTICS 

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists 
between regions of TRICH and transporters and ion channels. In addition, the expression of TRICH
is closely associated with tumorous tissues such as spleen tumor tissue, esophageal tumor tissue,
brain tumor tissue, and myxoma from atrium tissue; and normal tissues such as kidney, liver, nasal
polyp, prostate, thyroid, umbilical cord blood, neuronal, digestive, uterine endometrial tissue, and
normal brain tissue such as the tissues from striatum, globus pallidus, and posterior putamen.
Therefore, TRICH appears to play a role in transport, neurological, muscle, immunological and cell
proliferative disorders. In the treatment of disorders associated with increased TRICH expression or 
activity, it is desirable to decrease the expression or activity of TRICH. In the treatment of disorders 
associated with decreased TRICH expression or activity, it is desirable to increase the expression or 
activity of TRICH. 

Therefore, in one embodiment, TRICH or a fragment or derivative thereof may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of TRICH. Examples of such disorders include, but are not limited to, a transport disorder
such as akinesia, amyotrophic lateral sclerosis, ataxia telangiectasia, cystic fibrosis, Becker's
muscular dystrophy, Bell's palsy, Charcot-Marie-Tooth disease, diabetes mellitus, diabetes insipidus,
diabetic neuropathy, Duchenne muscular dystrophy, hyperkalemic periodic paralysis, normokalemic
periodic paralysis, Parkinson's disease, malignant hyperthermia, multidrug resistance, myasthenia
gravis, myotonic dystrophy, catatonia, tardive dyskinesia, dystonias, peripheral neuropathy, cerebral
neoplasms, prostate cancer, cardiac disorders associated with transport, e.g., angina, bradyarrhythmia,
tachyarrhythmia, hypertension, Long QT syndrome, myocarditis, cardiomyopathy, nemaline
myopathy, centronuclear myopathy, lipid myopathy, mitochondrial myopathy, thyrotoxic myopathy,
ethanol myopathy, dermatomyositis, inclusion body myositis, infectious myositis, polymyositis,
neurological disorders associated with transport, e.g., Alzheimer's disease, amnesia, bipolar disorder, 
dementia, depression, epilepsy, Tourette's disorder, paranoid psychoses, and schizophrenia, and other 
disorders associated with transport, e.g., neurofibromatosis, postherpetic neuralgia, trigeminal 
neuropathy, sarcoidosis, sickle cell anemia, Wilson's disease, cataracts, infertility, pulmonary artery
stenosis, sensorineural autosomal deafness, hyperglycemia, hypoglycemia, Graves' disease, goiter,
Cushing's disease, Addison's disease, glucose-galactose malabsorption syndrome, glycogen storage
disease, hypercholesterolemia, adrenoleukodystrophy, Zellweger syndrome, Menkes disease,
occipital horn syndrome, von Gierke disease, pseudohypoaldosteronism type 1, Liddle's syndrome,
cystinuria, iminoglycinuria, Hartnup disease, Fanconi disease, and Bartter syndrome; a neurological
disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's 
disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal 
disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural 
muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating
diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess,
suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system
disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and
Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the
nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis,
encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central
nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic 
nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other 
neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, 
inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental 
disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), 
akathisia, amnesia, catatonia, diabetic neuropathy, hemiplegic migraine, tardive dyskinesia, 
dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear 
palsy, corticobasal degeneration, and familial frontotemporal dementia; a muscle disorder such as 
cardiomyopathy, myocarditis, Duchenne's muscular dystrophy, Becker's muscular dystrophy, 
myotonic dystrophy, central core disease, nemaline myopathy, centronuclear myopathy, lipid 
myopathy, mitochondrial myopathy, infectious myositis, polymyositis, dermatomyositis, inclusion 
body myositis, thyrotoxic myopathy, ethanol myopathy, angina, anaphylactic shock, arrhythmias, 
asthma, cardiovascular shock, Cushing's syndrome, hypertension, hypoglycemia, myocardial 
infarction, migraine, pheochromocytoma, and myopathies including encephalopathy, epilepsy, 
Kearns-Sayre syndrome, lactic acidosis, myoclonic disorder, ophthalmoplegia, acid maltase 
deficiency (AMD, also known as Pompe's disease), generalized myotonia, and myotonia congenita; 
an immunological disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, 
adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, 
atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune 
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact 
dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, 
episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic 
gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's 
thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, 
myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, 
psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic 
anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative 
colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal 
circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; and a 
cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, 
hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal 
hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including 
adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in 
particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall 
bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, 
penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus. 

In another embodiment, a vector capable of expressing TRICH or a fragment or derivative 
thereof may be administered to a subject to treat or prevent a disorder associated with decreased 
expression or activity of TRICH including, but not limited to, those described above. 

In a further embodiment, a composition comprising a substantially purified TRICH in 
conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent 
a disorder associated with decreased expression or activity of TRICH including, but not limited to, 
those provided above. 

In still another embodiment, an agonist which modulates the activity of TRICH may be 
administered to a subject to treat or prevent a disorder associated with decreased expression or 
activity of TRICH including, but not limited to, those listed above. 

In a further embodiment, an antagonist of TRICH may be administered to a subject to treat or 
prevent a disorder associated with increased expression or activity of TRICH. Examples of such 
disorders include, but are not limited to, those transport, neurological, muscle, immunological and 
cell proliferative disorders described above. In one aspect, an antibody which specifically binds 
TRICH may be used directly as an antagonist or indirectly as a targeting or delivery mechanism for 
bringing a pharmaceutical agent to cells or tissues which express TRICH. 

In an additional embodiment, a vector expressing the complement of the polynucleotide 
encoding TRICH may be administered to a subject to treat or prevent a disorder associated with 
increased expression or activity of TRICH including, but not limited to, those described above. 

In other embodiments, any of the proteins, antagonists, antibodies, agonists, complementary 
sequences, or vectors of the invention may be administered in combination with other appropriate 
therapeutic agents. Selection of the appropriate agents for use in combination therapy may be made 
by one of ordinary skill in the art, according to conventional pharmaceutical principles. The 
combination of therapeutic agents may act synergistically to effect the treatment or prevention of the 
various disorders described above. Using this approach, one may be able to achieve therapeutic 
efficacy with lower dosages of each agent, thus reducing the potential for adverse side effects. 

An antagonist of TRICH may be produced using methods which are generally known in the 
art. In particular, purified TRICH may be used to produce antibodies or to screen libraries of 
pharmaceutical agents to identify those which specifically bind TRICH. Antibodies to TRICH may 
also be generated using methods that are well known in the art. Such antibodies may include, but are 
not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and 
fragments produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit 
dimer formation) are generally preferred for therapeutic use. 

For the production of antibodies, various hosts including goats, rabbits, rats, mice, humans, 
and others may be immunized by injection with TRICH or with any fragment or oligopeptide thereof 
which has immunogenic properties. Depending on the host species, various adjuvants may be used to 
increase immunological response. Such adjuvants include, but are not limited to, Freund's, mineral 
gels such as aluminum hydroxide, and surface active substances such as lysolecithin, pluronic 
polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among adjuvants used in 
humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are especially preferable. 

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to 
TRICH have an amino acid sequence consisting of at least about 5 amino acids, and generally will 
consist of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or 
fragments are identical to a portion of the amino acid sequence of the natural protein. Short stretches 
of TRICH amino acids may be fused with those of another protein, such as KLH, and antibodies to 
the chimeric molecule may be produced. 

Monoclonal antibodies to TRICH may be prepared using any technique which provides for 
the production of antibody molecules by continuous cell lines in culture. These include, but are not 
limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma 
technique. (See, e.g., Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. 
Immunol. Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; and 
Cole, S.P. et al. (1984) Mol. Cell Biol. 62:109-120.) 

In addition, techniques developed for the production of "chimeric antibodies," such as the 
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate 
antigen specificity and biological activity, can be used. (See, e.g., Morrison, S.L. et al. (1984) Proc. 
Natl. Acad. Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; and Takeda, 
S. et al. (1985) Nature 314:452-454.) Alternatively, techniques described for the production of single 
chain antibodies may be adapted, using methods known in the art, to produce TRICH-specific single 
chain antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be 
generated by chain shuffling from random combinatorial immunoglobulin libraries. (See, e.g., 
Burton, D.R. (1991) Proc. Natl. Acad. Sci. USA 88:10134-10137.) 

Antibodies may also be produced by inducing in vivo production in the lymphocyte 
population or by screening immunoglobulin libraries or panels of highly specific binding reagents as 
disclosed in the literature. (See, e.g., Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 
86:3833-3837; Winter, G. et al. (1991) Nature 349:293-299.) 

Antibody fragments which contain specific binding sites for TRICH may also be generated. 
For example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin 
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of 
the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and 
easy identification of monoclonal Fab fragments with the desired specificity. (See, e.g., Huse, W.D. 
et al. (1989) Science 246:1275-1281.) 

Various immunoassays may be used for screening to identify antibodies having the desired 
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either 
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such 
immunoassays typically involve the measurement of complex formation between TRICH and its 
specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies 
reactive to two non-interfering TRICH epitopes is generally used, but a competitive binding assay 
may also be employed (Pound, supra). 

Various methods such as Scatchard analysis in conjunction with radioimmunoassay 
techniques may be used to assess the affinity of antibodies for TRICH. Affinity is expressed as an 
association constant, Ka, which is defined as the molar concentration of TRICH-antibody complex 
divided by the product of the molar concentrations of free antigen and free antibody under 
equilibrium conditions. The Ka determined for a preparation of polyclonal antibodies, which are 
heterogeneous in their affinities for multiple TRICH epitopes, represents the average affinity, or 
avidity, of the antibodies for TRICH. The Ka determined for a preparation of monoclonal antibodies, 
which are monospecific for a particular TRICH epitope, represents a true measure of affinity. 
High-affinity antibody preparations with Ka ranging from about 10^9 to 10^12 L/mole are preferred 
for use in immunoassays in which the TRICH-antibody complex must withstand rigorous 
manipulations. Low-affinity antibody preparations with Ka ranging from about 10^6 to 10^7 L/mole 
are preferred for use in immunopurification and similar procedures which ultimately require 
dissociation of TRICH, preferably in active form, from the antibody (Catty, D. (1988) Antibodies, 
Volume I: A Practical Approach, IRL Press, Washington DC; Liddell, J.E. and A. Cryer (1991) 
A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New York NY). 
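
The relationship between the association constant and a Scatchard plot can be illustrated with a short numerical sketch. It assumes a single class of non-interacting binding sites, for which bound/free = Ka x (Bmax - bound), so that the slope of the plot is -Ka. The function names, the site concentration Bmax, and the noise-free data are hypothetical and are not taken from this specification.

```python
def binding_isotherm(free, ka, bmax):
    """Equilibrium concentration of bound antigen for one class of sites."""
    return bmax * ka * free / (1.0 + ka * free)


def scatchard_fit(free, bound):
    """Least-squares line through (bound, bound/free); returns (Ka, Bmax).

    Scatchard form: bound/free = Ka * (Bmax - bound), so the slope is -Ka
    and the intercept is Ka * Bmax.
    """
    ratio = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratio) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratio))
    sxx = sum((x - mean_x) ** 2 for x in bound)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope
    return ka, intercept / ka


# Hypothetical, noise-free data: Ka = 1e9 L/mole, 1 nM of binding sites.
TRUE_KA, TRUE_BMAX = 1.0e9, 1.0e-9
free_conc = [0.1e-9, 0.3e-9, 1.0e-9, 3.0e-9, 10.0e-9]  # mole/L
bound_conc = [binding_isotherm(f, TRUE_KA, TRUE_BMAX) for f in free_conc]
ka_est, bmax_est = scatchard_fit(free_conc, bound_conc)
```

With real radioimmunoassay data the points scatter about the line, and a curved plot may indicate heterogeneous affinities, as expected for a polyclonal preparation.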

The titer and avidity of polyclonal antibody preparations may be further evaluated to 
determine the quality and suitability of such preparations for certain downstream applications. For 
example, a polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml, 
preferably 5-10 mg specific antibody/ml, is generally employed in procedures requiring precipitation 
of TRICH-antibody complexes. Procedures for evaluating antibody specificity, titer, and avidity, and 
guidelines for antibody quality and usage in various applications, are generally available. (See, e.g., 
Catty, supra, and Coligan et al., supra.) 

In another embodiment of the invention, the polynucleotides encoding TRICH, or any 
fragment or complement thereof, may be used for therapeutic purposes. In one aspect, modifications 
of gene expression can be achieved by designing complementary sequences or antisense molecules 
(DNA, RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene 
encoding TRICH. Such technology is well known in the art, and antisense oligonucleotides or larger 
fragments can be designed from various locations along the coding or control regions of sequences 
encoding TRICH. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., 
Totowa NJ.) 

In therapeutic use, any gene delivery system suitable for introduction of the antisense 
sequences into appropriate target cells can be used. Antisense sequences can be delivered 
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence 
complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g., 
Slater, J.E. et al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J. et al. (1995) 
9(13):1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral 
vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A.D. (1990) Blood 
76:271; Ausubel, supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other 
gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other 
systems known in the art. (See, e.g., Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et 
al. (1998) J. Pharm. Sci. 87(11):1308-1315; and Morris, M.C. et al. (1997) Nucleic Acids Res. 
25(14):2730-2736.) 

In another embodiment of the invention, polynucleotides encoding TRICH may be used for 
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency 
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by 
X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined 
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency 
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475), 
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene 
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial 
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal, 
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii) 
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated 
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites 
(e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. 
(1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), 
hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides 
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the 
case where a genetic deficiency in TRICH expression or regulation causes disease, the expression of 
TRICH from an appropriate population of transduced cells may alleviate the clinical manifestations 
caused by the genetic deficiency. 

In a further embodiment of the invention, diseases or disorders caused by deficiencies in 
TRICH are treated by constructing mammalian expression vectors encoding TRICH and introducing 
these vectors by mechanical means into TRICH-deficient cells. Mechanical transfer technologies for 
use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) 
ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene 
transfer, and (v) the use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev. 
Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and H. Recipon (1998) Curr. 
Opin. Biotechnol. 9:445-450). 

Expression vectors that may be effective for the expression of TRICH include, but are not 
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors 
(Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), 
and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). TRICH 
may be expressed using (i) a constitutively active promoter, (e.g., from cytomegalovirus (CMV), 
Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible 
promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl. 
Acad. Sci. USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and 
H.M. Blau (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid 
(Invitrogen)); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; 
Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter 
(Rossi, F.M.V. and H.M. Blau, supra)), or (iii) a tissue-specific promoter or the native promoter of the 
endogenous gene encoding TRICH from a normal individual. 

Commercially available liposome transformation kits (e.g., the PERFECT LIPID 
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver 
polynucleotides to target cells in culture and require minimal effort to optimize experimental 
parameters. In the alternative, transformation is performed using the calcium phosphate method 
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al. 
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of 
these standardized mammalian transfection protocols. 

In another embodiment of the invention, diseases or disorders caused by genetic defects with 
respect to TRICH expression are treated by constructing a retrovirus vector consisting of (i) the 
polynucleotide encoding TRICH under the control of an independent promoter or the retrovirus long 
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive 
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences 
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are 
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. 
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in 
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for 
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al. 
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and 
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. 
et al. (1998) J. Virol. 72:9873-9880). U.S. Patent No. 5,910,434 to Rigg ("Method for obtaining 
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant") 
discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by 
reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ 
T-cells), and the return of transduced cells to a patient are procedures well known to persons skilled in 
the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020- 
7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; 
Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood 89:2283- 
2290). 

In the alternative, an adenovirus-based gene therapy delivery system is used to deliver 
polynucleotides encoding TRICH to cells which have one or more genetic abnormalities with respect 
to the expression of TRICH. The construction and packaging of adenovirus-based vectors are well 
known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to 
be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas 
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are 
described in U.S. Patent No. 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"), hereby 
incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999) Annu. Rev. 
Nutr. 19:511-544 and Verma, I.M. and N. Somia (1997) Nature 18:389:239-242, both incorporated by 
reference herein. 

In another alternative, a herpes-based gene therapy delivery system is used to deliver 
polynucleotides encoding TRICH to target cells which have one or more genetic abnormalities with 
respect to the expression of TRICH. The use of herpes simplex virus (HSV)-based vectors may be 
especially valuable for introducing TRICH to cells of the central nervous system, for which HSV has 
a tropism. The construction and packaging of herpes-based vectors are well known to those with 
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has 
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res. 
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S. 
Patent No. 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby 
incorporated by reference. U.S. Patent No. 5,804,413 teaches the use of recombinant HSV d92 which 
consists of a genome containing at least one exogenous gene to be transferred to a cell under the 
control of the appropriate promoter for purposes including human gene therapy. Also taught by this 
patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22. 
For HSV vectors, see also Goins, W.F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al. (1994) 
Dev. Biol. 163:152-161, hereby incorporated by reference. The manipulation of cloned herpesvirus 
sequences, the generation of recombinant virus following the transfection of multiple plasmids 
containing different segments of the large herpesvirus genomes, the growth and propagation of 
herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of 
ordinary skill in the art. 

In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to 
deliver polynucleotides encoding TRICH to target cells. The biology of the prototypic alphavirus, 
Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based 
on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During 
alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid 
proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA, 
resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity 
(e.g., protease and polymerase). Similarly, inserting the coding sequence for TRICH into the 
alphavirus genome in place of the capsid-coding region results in the production of a large number of 
TRICH-coding RNAs and the synthesis of high levels of TRICH in vector transduced cells. While 
alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a 
persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) 
indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy 
application (Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will 
allow the introduction of TRICH into a variety of cell types. The specific transduction of a subset of 
cells in a population may require the sorting of cells prior to transduction. The methods of 
manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA 
transfections, and performing alphavirus infections, are well known to those with ordinary skill in the 
art. 

Oligonucleotides derived from the transcription initiation site, e.g., between about positions 
-10 and +10 from the start site, may also be employed to inhibit gene expression. Similarly, 
inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful 
because it causes inhibition of the ability of the double helix to open sufficiently for the binding of 
polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using 
triplex DNA have been described in the literature. (See, e.g., Gee, J.E. et al. (1994) in Huber, B.E. 
and B.I. Carr, Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 
163-177.) A complementary sequence or antisense molecule may also be designed to block translation of 
mRNA by preventing the transcript from binding to ribosomes. 
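
By way of illustration only, the selection of an oligonucleotide spanning about positions -10 to +10 of a transcription start site can be sketched as follows. The sketch assumes the common numbering convention in which +1 is the first transcribed base and there is no position 0; the function names and the toy sequence are hypothetical and do not represent any TRICH sequence.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def start_site_antisense(sense_strand, plus_one_index, upstream=10, downstream=10):
    """Antisense oligo covering positions -upstream..+downstream.

    plus_one_index is the 0-based index of position +1 in sense_strand.
    Because there is no position 0, the window holds upstream + downstream bases.
    """
    start = plus_one_index - upstream
    end = plus_one_index + downstream
    if start < 0 or end > len(sense_strand):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(sense_strand[start:end])


# Toy sense strand: ten bases upstream of +1 followed by ten transcribed bases.
toy_sense = "A" * 10 + "C" * 10
toy_oligo = start_site_antisense(toy_sense, plus_one_index=10)
```

The returned oligonucleotide is written 5' to 3' and is complementary to the sense strand over the chosen window; candidate oligonucleotides would still need to be tested empirically for inhibition.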

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of 
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme 
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example, 
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze 
endonucleolytic cleavage of sequences encoding TRICH. 

Specific ribozyme cleavage sites within any potential RNA target are initially identified by 
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA, 
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides, 
corresponding to the region of the target gene containing the cleavage site, may be evaluated for 
secondary structural features which may render the oligonucleotide inoperable. The suitability of 
candidate targets may also be evaluated by testing accessibility to hybridization with complementary 
oligonucleotides using ribonuclease protection assays. 
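
The scanning step described above can be sketched in a few lines. This is a minimal illustration, assuming a plain string representation of the target RNA; the function name, the default window of 17 ribonucleotides, and the example sequence are hypothetical. Secondary-structure evaluation and accessibility testing are not modeled.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_sites(rna, window=17):
    """Return (index, triplet, flanking_region) for each candidate site.

    index is the 0-based position of the G of the triplet; flanking_region
    is a stretch of `window` ribonucleotides containing the triplet.
    """
    if not 15 <= window <= 20:
        raise ValueError("window should be between 15 and 20 ribonucleotides")
    rna = rna.upper().replace("T", "U")  # accept a cDNA-style sequence too
    sites = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in CLEAVAGE_TRIPLETS:
            # Roughly centre the window on the triplet, clipped to the ends.
            start = max(0, i + 1 - window // 2)
            end = min(len(rna), start + window)
            start = max(0, end - window)
            sites.append((i, triplet, rna[start:end]))
    return sites


# Hypothetical 33-nt target used only to exercise the function.
example_rna = "AUGGCCGUCAAGGCUUACGGAUUGUACCGGAUC"
candidate_sites = find_cleavage_sites(example_rna)
```

Each flanking region returned here is the short sequence that would then be examined for secondary structural features and tested for accessibility.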

Complementary ribonucleic acid molecules and ribozymes of the invention may be prepared 
by any method known in the art for the synthesis of nucleic acid molecules. These include techniques 
for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. 
Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA 
sequences encoding TRICH. Such DNA sequences may be incorporated into a wide variety of 
vectors with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA 
constructs that synthesize complementary RNA, constitutively or inducibly, can be introduced into 
cell lines, cells, or tissues. 

RNA molecules may be modified to increase intracellular stability and half-life. Possible 
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' 
ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester 
linkages within the backbone of the molecule. This concept is inherent in the production of PNAs 
and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine, 
queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, 
cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous 
endonucleases. 

An additional embodiment of the invention encompasses a method for screening for a 
compound which is effective in altering expression of a polynucleotide encoding TRICH. 
Compounds which may be effective in altering expression of a specific polynucleotide may include, 
but are not limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming 
oligonucleotides, transcription factors and other polypeptide transcriptional regulators, and 
non-macromolecular chemical entities which are capable of interacting with specific polynucleotide 
sequences. Effective compounds may alter polynucleotide expression by acting as either inhibitors or 
promoters of polynucleotide expression. Thus, in the treatment of disorders associated with increased 
TRICH expression or activity, a compound which specifically inhibits expression of the 
polynucleotide encoding TRICH may be therapeutically useful, and in the treatment of disorders 
associated with decreased TRICH expression or activity, a compound which specifically promotes 
expression of the polynucleotide encoding TRICH may be therapeutically useful. 

At least one, and up to a plurality, of test compounds may be screened for effectiveness in 
altering expression of a specific polynucleotide. A test compound may be obtained by any method 
commonly known in the art, including chemical modification of a compound known to be effective in 
altering polynucleotide expression; selection from an existing, commercially-available or proprietary 
library of naturally-occurring or non-natural chemical compounds; rational design of a compound 
based on chemical and/or structural properties of the target polynucleotide; and selection from a 
library of chemical compounds created combinatorially or randomly. A sample comprising a 
polynucleotide encoding TRICH is exposed to at least one test compound thus obtained. The sample 
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted 
biochemical system. Alterations in the expression of a polynucleotide encoding TRICH are assayed 
by any method commonly known in the art. Typically, the expression of a specific nucleotide is 
detected by hybridization with a probe having a nucleotide sequence complementary to the sequence 
of the polynucleotide encoding TRICH. The amount of hybridization may be quantified, thus 
forming the basis for a comparison of the expression of the polynucleotide both with and without 
exposure to one or more test compounds. Detection of a change in the expression of a polynucleotide 
exposed to a test compound indicates that the test compound is effective in altering the expression of 
the polynucleotide. A screen for a compound effective in altering expression of a specific 
polynucleotide can be carried out, for example, using a Schizosaccharomyces pombe gene expression 
system (Atkins, D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids 
Res. 28:E15) or a human cell line such as HeLa cell (Clarke, M.L. et al. (2000) Biochem. Biophys. 
Res. Commun. 268:8-13). A particular embodiment of the present invention involves screening a 
combinatorial library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide 
nucleic acids, and modified oligonucleotides) for antisense activity against a specific polynucleotide
sequence (Bruice, T.W. et al. (1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S.
Patent No. 6,022,691). 

Many methods for introducing vectors into cells or tissues are available and equally suitable 
for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells
taken from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art. (See, e.g., Goldman, C.K. et al. (1997) Nat.
Biotechnol. 15:462-466.)

Any of the therapeutic methods described above may be applied to any subject in need of 
such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and 
monkeys.

An additional embodiment of the invention relates to the administration of a composition 
which generally comprises an active ingredient formulated with a pharmaceutically acceptable
excipient. Excipients may include, for example, sugars, starches, celluloses, gums, and proteins.
Various formulations are commonly known and are thoroughly discussed in the latest edition of
Remington's Pharmaceutical Sciences (Maack Publishing, Easton PA). Such compositions may
consist of TRICH, antibodies to TRICH, and mimetics, agonists, antagonists, or inhibitors of TRICH.

The compositions utilized in this invention may be administered by any number of routes 
including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, 
intrathecal, intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal,
enteral, topical, sublingual, or rectal means.

Compositions for pulmonary administration may be prepared in liquid or dry powder form. 
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the 
case of small molecules (e.g. traditional low molecular weight organic drugs), aerosol delivery of 
fast-acting formulations is well-known in the art. In the case of macromolecules (e.g. larger peptides 
and proteins), recent developments in the field of pulmonary delivery via the alveolar region of the
lung have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton,
J.S. et al., U.S. Patent No. 5,997,848). Pulmonary delivery has the advantage of administration 
without needle injection, and obviates the need for potentially toxic penetration enhancers. 

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination
of an effective dose is well within the capability of those skilled in the art.

Specialized forms of compositions may be prepared for direct intracellular delivery of 
macromolecules comprising TRICH or fragments thereof. For example, liposome preparations 
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of
the macromolecule. Alternatively, TRICH or a fragment thereof may be joined to a short cationic
N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et 
al. (1999) Science 285:1569-1572). 

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs,
monkeys, or pigs. An animal model may also be used to determine the appropriate concentration
range and route of administration. Such information can then be used to determine useful doses and 
routes for administration in humans. 

A therapeutically effective dose refers to that amount of active ingredient, for example
TRICH or fragments thereof, antibodies of TRICH, and agonists, antagonists or inhibitors of TRICH,
which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined
by standard pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose
lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no toxicity.
The dosage varies within this range depending upon the dosage form employed, the sensitivity of the
patient, and the route of administration.
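
As an illustration of the therapeutic index described above, the following minimal Python sketch computes the LD50/ED50 ratio and ranks compositions by it. The function name, the composition names, and the dose values are hypothetical and serve only to show the arithmetic; they are not data from this disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the LD50/ED50 ratio; both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
candidates = {
    "composition_a": (500.0, 5.0),
    "composition_b": (120.0, 10.0),
}

# Larger therapeutic indices are preferred, so sort in descending order.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
```

Under these assumed values, the composition with the larger ratio is listed first, consistent with the stated preference for large therapeutic indices.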



The exact dosage will be determined by the practitioner, in light of factors related to the 
subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the 
active moiety or to maintain the desired effect. Factors which may be taken into account include the 
severity of the disease state, the general health of the subject, the age, weight, and gender of the 
subject, time and frequency of administration, drug combination(s), reaction sensitivities, and
response to therapy. Long-acting compositions may be administered every 3 to 4 days, every week,
or biweekly depending on the half-life and clearance rate of the particular formulation. 

Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their 
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, 
conditions, locations, etc.

DIAGNOSTICS

In another embodiment, antibodies which specifically bind TRICH may be used for the
diagnosis of disorders characterized by expression of TRICH, or in assays to monitor patients being 
treated with TRICH or agonists, antagonists, or inhibitors of TRICH. Antibodies useful for 
diagnostic purposes may be prepared in the same manner as described above for therapeutics. 
Diagnostic assays for TRICH include methods which utilize the antibody and a label to detect TRICH
in human body fluids or in extracts of cells or tissues. The antibodies may be used with or without
modification, and may be labeled by covalent or non-covalent attachment of a reporter molecule. A 
wide variety of reporter molecules, several of which are described above, are known in the art and 
may be used. 

A variety of protocols for measuring TRICH, including ELISAs, RIAs, and FACS, are known 
in the art and provide a basis for diagnosing altered or abnormal levels of TRICH expression. Normal
or standard values for TRICH expression are established by combining body fluids or cell extracts 
taken from normal mammalian subjects, for example, human subjects, with antibodies to TRICH 
under conditions suitable for complex formation. The amount of standard complex formation may be 
quantitated by various methods, such as photometric means. Quantities of TRICH expressed in 
subject, control, and disease samples from biopsied tissues are compared with the standard values.
Deviation between standard and subject values establishes the parameters for diagnosing disease. 
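
The comparison of subject values with standard values can be sketched in Python as follows. The text does not prescribe a statistical rule, so the z-score criterion, the default threshold of 2.0, the function name, and the readings below are all assumptions introduced for illustration.

```python
from statistics import mean, stdev


def deviation_from_standard(standard_values, subject_value, threshold=2.0):
    """Return (z_score, is_abnormal) for a subject value.

    The z-score rule and the default threshold are assumptions made for
    this sketch; the text only requires that a deviation be detected.
    """
    if len(standard_values) < 2:
        raise ValueError("need at least two standard values")
    mu = mean(standard_values)
    sd = stdev(standard_values)
    if sd == 0:
        raise ValueError("standard values show no variation")
    z = (subject_value - mu) / sd
    return z, abs(z) > threshold


# Hypothetical photometric readings (arbitrary units) from normal subjects.
standard = [10.0, 12.0, 11.0, 9.0, 13.0]
z_high, abnormal_high = deviation_from_standard(standard, 20.0)
z_near, abnormal_near = deviation_from_standard(standard, 11.5)
```

In practice the threshold would be chosen by the practitioner for the particular assay and disorder; the value used here has no special standing.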

In another embodiment of the invention, the polynucleotides encoding TRICH may be used 
for diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences, 
complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect 
and quantify gene expression in biopsied tissues in which expression of TRICH may be correlated
with disease. The diagnostic assay may be used to determine absence, presence, and excess
expression of TRICH, and to monitor regulation of TRICH levels during therapeutic intervention. 

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide
sequences, including genomic sequences, encoding TRICH or closely related molecules may be used
to identify nucleic acid sequences which encode TRICH. The specificity of the probe, whether it is
made from a highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a 
conserved motif, and the stringency of the hybridization or amplification will determine whether the 
probe identifies only naturally occurring sequences encoding TRICH, allelic variants, or related 
sequences. 

Probes may also be used for the detection of related sequences, and may have at least 50%
sequence identity to any of the TRICH encoding sequences. The hybridization probes of the subject
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:21-40 or from 
genomic sequences including promoters, enhancers, and introns of the TRICH gene. 
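
The sequence identity criterion mentioned above can be illustrated with a short Python sketch. It assumes the probe and target have already been aligned to equal length and counts identical positions over the alignment length; other definitions of percent identity exist, and the function names and example sequences are hypothetical.

```python
def percent_identity(probe: str, target: str) -> float:
    """Percent of aligned positions at which two sequences carry the same base.

    Both sequences must already be aligned to the same length; gap
    characters ('-') never count as matches. Other identity definitions exist.
    """
    if len(probe) != len(target) or not probe:
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    matches = sum(
        1 for a, b in zip(probe.upper(), target.upper()) if a == b and a != "-"
    )
    return 100.0 * matches / len(probe)


def may_detect_related(probe: str, target: str, minimum: float = 50.0) -> bool:
    """True when identity meets the 'at least 50%' level mentioned in the text."""
    return percent_identity(probe, target) >= minimum
```

Whether a probe actually hybridizes also depends on stringency, as noted above, so an identity value of this kind is only a first screen.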

Means for producing specific hybridization probes for DNAs encoding TRICH include the
cloning of polynucleotide sequences encoding TRICH or TRICH derivatives into vectors for the
production of mRNA probes. Such vectors are known in the art, are commercially available, and may
be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a
variety of reporter groups, for example, by radionuclides such as 32P or 35S, or by enzymatic labels,
such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

Polynucleotide sequences encoding TRICH may be used for the diagnosis of disorders 
associated with expression of TRICH. Examples of such disorders include, but are not limited to, a 
transport disorder such as akinesia, amyotrophic lateral sclerosis, ataxia telangiectasia, cystic fibrosis, 
Becker's muscular dystrophy, Bell's palsy, Charcot-Marie-Tooth disease, diabetes mellitus, diabetes
insipidus, diabetic neuropathy, Duchenne muscular dystrophy, hyperkalemic periodic paralysis,
normokalemic periodic paralysis, Parkinson's disease, malignant hyperthermia, multidrug resistance,
myasthenia gravis, myotonic dystrophy, catatonia, tardive dyskinesia, dystonias, peripheral
neuropathy, cerebral neoplasms, prostate cancer, cardiac disorders associated with transport, e.g.,
angina, bradyarrhythmia, tachyarrhythmia, hypertension, Long QT syndrome, myocarditis,
cardiomyopathy, nemaline myopathy, centronuclear myopathy, lipid myopathy, mitochondrial
myopathy, thyrotoxic myopathy, ethanol myopathy, dermatomyositis, inclusion body myositis, 
infectious myositis, polymyositis, neurological disorders associated with transport, e.g., Alzheimer's 
disease, amnesia, bipolar disorder, dementia, depression, epilepsy, Tourette's disorder, paranoid 
psychoses, and schizophrenia, and other disorders associated with transport, e.g., neurofibromatosis,
postherpetic neuralgia, trigeminal neuropathy, sarcoidosis, sickle cell anemia, Wilson's disease,
cataracts, infertility, pulmonary artery stenosis, sensorineural autosomal deafness, hyperglycemia,
hypoglycemia, Graves' disease, goiter, Cushing's disease, Addison's disease, glucose-galactose
malabsorption syndrome, glycogen storage disease, hypercholesterolemia, adrenoleukodystrophy,
Zellweger syndrome, Menkes disease, occipital horn syndrome, von Gierke disease,
pseudohypoaldosteronism type 1, Liddle's syndrome, cystinuria, iminoglycinuria, Hartnup disease,
Fanconi disease, and Bartter syndrome; a neurological disorder such as epilepsy, ischemic 
cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, 
Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic 
lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis
pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and
viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases
including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal
familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis,
tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental
retardation and other developmental disorders of the central nervous system including Down 
syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve 
disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral 
nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and
toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety,
and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia,
diabetic neuropathy, hemiplegic migraine, tardive dyskinesia, dystonias, paranoid psychoses,
postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal
degeneration, and familial frontotemporal dementia; a muscle disorder such as cardiomyopathy,
myocarditis, Duchenne's muscular dystrophy, Becker's muscular dystrophy, myotonic dystrophy,
central core disease, nemaline myopathy, centronuclear myopathy, lipid myopathy, mitochondrial 
myopathy, infectious myositis, polymyositis, dermatomyositis, inclusion body myositis, thyrotoxic 
myopathy, ethanol myopathy, angina, anaphylactic shock, arrhythmias, asthma, cardiovascular shock, 
Cushing's syndrome, hypertension, hypoglycemia, myocardial infarction, migraine,
pheochromocytoma, and myopathies including encephalopathy, epilepsy, Kearns-Sayre syndrome,
lactic acidosis, myoclonic disorder, ophthalmoplegia, acid maltase deficiency (AMD, also known as 
Pompe's disease), generalized myotonia, and myotonia congenita; an immunological disorder such as 
acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress 
syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis,
autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact
dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema,
episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic
gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis,
hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or
pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's 
syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic 
lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, 
Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral,
bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; and a cell proliferative
disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed 
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, 
polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma,
leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of
the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia,
gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate,
salivary glands, skin, spleen, testis, thymus, thyroid, and uterus. The polynucleotide sequences
encoding TRICH may be used in Southern or northern analysis, dot blot, or other membrane-based
technologies; in PCR technologies; in dipstick, pin, and multiformat ELISA-like assays; and in
microarrays utilizing fluids or tissues from patients to detect altered TRICH expression. Such
qualitative or quantitative methods are well known in the art. 

In a particular aspect, the nucleotide sequences encoding TRICH may be useful in assays that
detect the presence of associated disorders, particularly those mentioned above. The nucleotide
sequences encoding TRICH may be labeled by standard methods and added to a fluid or tissue sample
from a patient under conditions suitable for the formation of hybridization complexes. After a
suitable incubation period, the sample is washed and the signal is quantified and compared with a
standard value. If the amount of signal in the patient sample is significantly altered in comparison to
a control sample, then the presence of altered levels of nucleotide sequences encoding TRICH in the
sample indicates the presence of the associated disorder. Such assays may also be used to evaluate
the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to
monitor the treatment of an individual patient.

In order to provide a basis for the diagnosis of a disorder associated with expression of 
TRICH, a normal or standard profile for expression is established. This may be accomplished by 
combining body fluids or cell extracts taken from normal subjects, either animal or human, with a
sequence, or a fragment thereof, encoding TRICH, under conditions suitable for hybridization or
amplification. Standard hybridization may be quantified by comparing the values obtained from
normal subjects with values from an experiment in which a known amount of a substantially purified 
polynucleotide is used. Standard values obtained in this manner may be compared with values 
obtained from samples from patients who are symptomatic for a disorder. Deviation from standard 
values is used to establish the presence of a disorder.

Once the presence of a disorder is established and a treatment protocol is initiated, 
hybridization assays may be repeated on a regular basis to determine if the level of expression in the 
patient begins to approximate that which is observed in the normal subject. The results obtained from
successive assays may be used to show the efficacy of treatment over a period ranging from several
days to months.

With respect to cancer, the presence of an abnormal amount of transcript (either under- or 
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the 
development of the disease, or may provide a means for detecting the disease prior to the appearance 
of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals
to employ preventative measures or aggressive treatment earlier, thereby preventing the development
or further progression of the cancer. 

Additional diagnostic uses for oligonucleotides designed from the sequences encoding 
TRICH may involve the use of PCR. These oligomers may be chemically synthesized, generated 
enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide
encoding TRICH, or a fragment of a polynucleotide complementary to the polynucleotide encoding
TRICH, and will be employed under optimized conditions for identification of a specific gene or 
condition. Oligomers may also be employed under less stringent conditions for detection or 
quantification of closely related DNA or RNA sequences. 

In a particular aspect, oligonucleotide primers derived from the polynucleotide sequences
encoding TRICH may be used to detect single nucleotide polymorphisms (SNPs). SNPs are
substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic
disease in humans. Methods of SNP detection include, but are not limited to, single-stranded
conformation polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP,
oligonucleotide primers derived from the polynucleotide sequences encoding TRICH are used to
amplify DNA using the polymerase chain reaction (PCR). The DNA may be derived, for example,
from diseased or normal tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause
differences in the secondary and tertiary structures of PCR products in single-stranded form, and
these differences are detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the
oligonucleotide primers are fluorescently labeled, which allows detection of the amplimers in
high-throughput equipment such as DNA sequencing machines. Additionally, sequence database analysis
methods, termed in silico SNP (isSNP), are capable of identifying polymorphisms by comparing the
sequence of individual overlapping DNA fragments which assemble into a common consensus
sequence. These computer-based methods filter out sequence variations due to laboratory preparation
of DNA and sequencing errors using statistical models and automated analyses of DNA sequence
chromatograms. In the alternative, SNPs may be detected and characterized by mass spectrometry
using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
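
The in silico comparison of overlapping fragments can be sketched in Python as shown below. This is a simplified illustration rather than the isSNP method itself: it assumes the fragments are already aligned to a common consensus, handles substitutions only, and replaces the statistical error models mentioned above with a simple minimum read count. The function name, parameter, and example fragments are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_count=2):
    """Return (position, major_base, minor_base) for polymorphic columns.

    aligned_reads: equal-length strings already aligned to one consensus;
    any character other than A, C, G, or T is treated as 'no call'.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    snps = []
    for pos in range(length):
        counts = Counter(r[pos] for r in aligned_reads if r[pos] in "ACGT")
        if len(counts) < 2:
            continue
        (major, _), (minor, minor_n) = counts.most_common(2)
        # A variant seen in too few reads is treated as a likely sequencing error.
        if minor_n >= min_minor_count:
            snps.append((pos, major, minor))
    return snps


# Hypothetical aligned fragments, for illustration only.
reads = ["ACGTAC", "ACGTAC", "ACTTAC", "ACTTAC", "ACGTAA"]
```

With the default setting, a variant supported by only one fragment is discarded, which loosely mirrors the filtering of sequencing errors described above.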

Methods which may also be used to quantify the expression of TRICH include radiolabeling 
or biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from 
standard curves. (See, e.g., Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C.
et al. (1993) Anal. Biochem. 212:229-236.) The speed of quantitation of multiple samples may be
accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide of 
interest is presented in various dilutions and a spectrophotometric or colorimetric response gives 
rapid quantitation. 
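
Interpolating a result from a standard curve can be illustrated with the following Python sketch. It assumes piecewise-linear interpolation between measured standards and a signal that rises with amount; the function name and the dilution series are hypothetical, and a fitted curve could equally be used.

```python
def amount_from_standard_curve(standards, signal):
    """Interpolate an unknown amount from (known_amount, signal) standards.

    Assumes the signal rises with amount and that the unknown falls inside
    the range spanned by the standards; no extrapolation is attempted.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return a0
            return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal lies outside the standard curve")


# Hypothetical dilution series: (amount in ng, absorbance).
curve = [(0.0, 0.0), (10.0, 0.5), (20.0, 1.0)]
```

A reading that falls outside the range of the standards raises an error rather than being extrapolated, since values beyond the measured dilutions are not supported by the curve.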

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotide sequences described herein may be used as elements on a microarray. The microarray
can be used in transcript imaging techniques which monitor the relative expression levels of large
numbers of genes simultaneously as described below. The microarray may also be used to identify
genetic variants, mutations, and polymorphisms. This information may be used to determine gene
function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor
progression/regression of disease as a function of gene expression, and to develop and monitor the
activities of therapeutic agents in the treatment of disease. In particular, this information may be used
to develop a pharmacogenomic profile of a patient in order to select the most appropriate and
effective treatment regimen for that patient. For example, therapeutic agents which are highly
effective and display the fewest side effects may be selected for a patient based on his/her
pharmacogenomic profile.

In another embodiment, TRICH, fragments of TRICH, or antibodies specific for TRICH may 
be used as elements on a microarray. The microarray may be used to monitor or measure
protein-protein interactions, drug-target interactions, and gene expression profiles, as described above.

A particular embodiment relates to the use of the polynucleotides of the present invention to
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by 
quantifying the number of expressed genes and their relative abundance under given conditions and at 
a given time. (See Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent No.
5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by
hybridizing the polynucleotides of the present invention or their complements to the totality of
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present 
invention or their complements comprise a subset of a plurality of elements on a microarray. The 
resultant transcript image would provide a profile of gene activity.

Transcript images may be generated using transcripts isolated from tissues, cell lines,
biopsies, or other biological samples. The transcript image may thus reflect gene expression in vivo,
as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line. 

Transcript images which profile the expression of the polynucleotides of the present
invention may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and
toxicity (Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson
(2000) Toxicol. Lett. 112-113:467-471, expressly incorporated by reference herein). If a test
compound has a signature similar to that of a compound with known toxicity, it is likely to share
those toxic properties. These fingerprints or signatures are most useful and refined when they contain
expression information from a large number of genes and gene families. Ideally, a genome-wide
measurement of expression provides the highest quality signature. Genes whose expression is
not altered by any tested compounds are also important, as the levels of expression of these genes
are used to normalize the rest of the expression data. The normalization procedure is useful for
comparison of expression data after treatment with different compounds. While the assignment of
gene function to elements of a toxicant signature aids in interpretation of toxicity mechanisms,
knowledge of gene function is not necessary for the statistical matching of signatures which leads to
prediction of toxicity. (See, for example, Press Release 00-02 from the National Institute of
Environmental Health Sciences, released February 29, 2000, available at
http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important and desirable in
toxicological screening using toxicant signatures to include all expressed gene sequences.
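
The normalization and statistical matching of toxicant signatures described above can be sketched in Python as follows. The text does not name a particular statistic, so the use of mean-of-reference-genes scaling and Pearson correlation is an assumption of this sketch, as are the function names, gene labels, and signal values.

```python
from math import sqrt


def normalize(profile, reference_genes):
    """Scale expression values by the mean level of unaltered reference genes."""
    ref = sum(profile[g] for g in reference_genes) / len(reference_genes)
    if ref <= 0:
        raise ValueError("reference genes must have positive expression")
    return {gene: value / ref for gene, value in profile.items()}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return cov / sqrt(var_x * var_y)


def signature_similarity(test, known, reference_genes):
    """Correlate two normalized profiles over the genes they share."""
    t, k = normalize(test, reference_genes), normalize(known, reference_genes)
    genes = sorted(set(t) & set(k))
    return pearson([t[g] for g in genes], [k[g] for g in genes])


# Hypothetical raw signals; the test array was scanned at twice the intensity.
known_toxicant = {"ref1": 100.0, "ref2": 100.0, "g1": 400.0, "g2": 50.0, "g3": 200.0}
test_compound = {gene: 2.0 * value for gene, value in known_toxicant.items()}
```

Because both profiles are divided by their own reference-gene level, the two-fold difference in raw intensity disappears after normalization, and the signatures match even though no gene function was consulted.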

In one embodiment, the toxicity of a test compound is assessed by treating a biological 
sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the
treated biological sample are hybridized with one or more probes specific to the polynucleotides of
the present invention, so that transcript levels corresponding to the polynucleotides of the present 
invention may be quantified. The transcript levels in the treated biological sample are compared with 
levels in an untreated biological sample. Differences in the transcript levels between the two samples 
are indicative of a toxic response caused by the test compound in the treated sample. 
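
The comparison of transcript levels between treated and untreated samples can be illustrated with a short Python sketch. The log2 ratio, the pseudocount, the two-fold threshold, and all names and values below are assumptions chosen for illustration; the text requires only that a difference between the two samples be detected.

```python
from math import log2


def log2_fold_changes(treated, untreated, pseudocount=1.0):
    """log2 ratio of treated to untreated transcript levels per shared gene.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for transcripts that are undetected in one sample.
    """
    return {
        gene: log2((treated[gene] + pseudocount) / (untreated[gene] + pseudocount))
        for gene in treated
        if gene in untreated
    }


def altered_transcripts(changes, threshold=1.0):
    """Genes whose level changed at least 2**threshold-fold in either direction."""
    return sorted(g for g, c in changes.items() if abs(c) >= threshold)


# Hypothetical hybridization signals, for illustration only.
treated = {"gene_a": 7.0, "gene_b": 3.0, "gene_c": 0.0}
untreated = {"gene_a": 1.0, "gene_b": 3.0, "gene_c": 3.0}
changes = log2_fold_changes(treated, untreated)
```

Calling `altered_transcripts(changes)` on these assumed values reports the transcripts that rose or fell, while the unchanged transcript is omitted.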

Another particular embodiment relates to the use of the polypeptide sequences of the present
invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global
pattern of protein expression in a particular tissue or cell type. Each protein component of a
proteome can be subjected individually to further analysis. Proteome expression patterns, or profiles,
are analyzed by quantifying the number of expressed proteins and their relative abundance under
given conditions and at a given time. A profile of a cell's proteome may thus be generated by
separating and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the
separation is achieved using two-dimensional gel electrophoresis, in which proteins from a sample are
separated by isoelectric focusing in the first dimension, and then according to molecular weight by
sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson,
supra). The proteins are visualized in the gel as discrete and uniquely positioned spots, typically by
staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains. The optical
density of each protein spot is generally proportional to the level of the protein in the sample. The
optical densities of equivalently positioned protein spots from different samples, for example, from
biological samples either treated or untreated with a test compound or therapeutic agent, are
compared to identify any changes in protein spot density related to the treatment. The proteins in the
spots are partially sequenced using, for example, standard methods employing chemical or enzymatic
cleavage followed by mass spectrometry. The identity of the protein in a spot may be determined by
comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the
polypeptide sequences of the present invention. In some cases, further sequence data may be
obtained for definitive protein identification.
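
The comparison of a partial sequence with reference polypeptide sequences can be sketched in Python as follows. Exact substring matching is an assumption of this sketch, and the function name and the one-letter sequences are hypothetical; they are not sequences of the present invention.

```python
def identify_spot(partial_sequence, reference_polypeptides, min_length=5):
    """Names of reference polypeptides containing the partial sequence.

    min_length reflects the 'at least 5 contiguous amino acid residues'
    preference stated in the text; exact substring matching is an
    assumption of this sketch.
    """
    fragment = partial_sequence.strip().upper()
    if len(fragment) < min_length:
        raise ValueError(f"partial sequence must have at least {min_length} residues")
    return sorted(
        name
        for name, sequence in reference_polypeptides.items()
        if fragment in sequence.upper()
    )


# Hypothetical one-letter sequences, for illustration only.
references = {
    "polypeptide_1": "MKTLLVAGAVLLSGCE",
    "polypeptide_2": "MSDQEAKPSTEDLGDK",
}
```

A fragment that matches more than one reference, or none, would call for the further sequence data mentioned above before a definitive identification is made.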

A proteomic profile may also be generated using antibodies specific for TRICH to quantify 
the levels of TRICH expression. In one embodiment, the antibodies are used as elements on a
microarray, and protein expression levels are quantified by exposing the microarray to the sample and
detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem.
270:103-111; Mendoza, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed
by a variety of methods known in the art, for example, by reacting the proteins in the sample with a
thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at
each array element.

Toxicant signatures at the proteome level are also useful for toxicological screening, and 
should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor
correlation between transcript and protein abundances for some proteins in some tissues (Anderson,
N.L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be
useful in the analysis of compounds which do not significantly affect the transcript image, but which
alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to
rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such
cases.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins that are expressed in the treated
biological sample are separated so that the amount of each protein can be quantified. The amount of
each protein is compared to the amount of the corresponding protein in an untreated biological
sample. A difference in the amount of protein between the two samples is indicative of a toxic
response to the test compound in the treated sample. Individual proteins are identified by sequencing
the amino acid residues of the individual proteins and comparing these partial sequences to the
polypeptides of the present invention.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins from the biological sample are
incubated with antibodies specific to the polypeptides of the present invention. The amount of
protein recognized by the antibodies is quantified. The amount of protein in the treated biological
sample is compared with the amount in an untreated biological sample. A difference in the amount of
protein between the two samples is indicative of a toxic response to the test compound in the treated
sample.
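
The treated-versus-untreated comparison described in the two preceding embodiments may be sketched as follows. This is a minimal illustration only, assuming that the protein amounts have already been quantified as one value per protein. The function name, the protein labels, and the two-fold cutoff are hypothetical and are not specified in this disclosure.

```python
def changed_proteins(treated, untreated, min_fold=2.0):
    """Return proteins whose amount differs between treated and untreated samples.

    treated, untreated: dicts mapping a protein label to a quantified amount.
    min_fold: hypothetical cutoff; a treated/untreated ratio of at least
    min_fold, or at most 1/min_fold, is reported as a difference.
    """
    flagged = {}
    for protein in sorted(set(treated) & set(untreated)):
        t, u = treated[protein], untreated[protein]
        if t <= 0 or u <= 0:
            continue  # ratio undefined; absent proteins need separate handling
        ratio = t / u
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            flagged[protein] = ratio
    return flagged


# Illustrative amounts in arbitrary units (not measured data)
treated = {"spot_1": 40.0, "spot_2": 10.0, "spot_3": 5.0}
untreated = {"spot_1": 10.0, "spot_2": 11.0, "spot_3": 20.0}
result = changed_proteins(treated, untreated)
```

In practice the cutoff would be chosen with regard to the variability of the quantification method used.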

Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g.,
Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci.
USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al.
(1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-
2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.) Various types of microarrays are
well known and thoroughly described in DNA Microarrays: A Practical Approach, M. Schena, ed.
(1999) Oxford University Press, London, hereby expressly incorporated by reference.

In another embodiment of the invention, nucleic acid sequences encoding TRICH may be
used to generate hybridization probes useful in mapping the naturally occurring genomic sequence.
Either coding or noncoding sequences may be used, and in some instances, noncoding sequences may
be preferable over coding sequences. For example, conservation of a coding sequence among
members of a multi-gene family may potentially cause undesired cross hybridization during
chromosomal mapping. The sequences may be mapped to a particular chromosome, to a specific
region of a chromosome, or to artificial chromosome constructions, e.g., human artificial
chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes
(BACs), bacterial P1 constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J.J.
et al. (1997) Nat. Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J.
(1991) Trends Genet. 7:149-154.) Once mapped, the nucleic acid sequences of the invention may be
used to develop genetic linkage maps, for example, which correlate the inheritance of a disease state
with the inheritance of a particular chromosome region or restriction fragment length polymorphism
(RFLP). (See, for example, Lander, E.S. and D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-
7357.)

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic 
map data. (See, e.g., Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968.) Examples of genetic
map data can be found in various scientific journals or at the Online Mendelian Inheritance in Man 
(OMIM) World Wide Web site. Correlation between the location of the gene encoding TRICH on a 
physical map and a specific disorder, or a predisposition to a specific disorder, may help define the 
region of DNA associated with that disorder and thus may further positional cloning efforts. 

In situ hybridization of chromosomal preparations and physical mapping techniques, such as
linkage analysis using established chromosomal markers, may be used for extending genetic maps.
Often the placement of a gene on the chromosome of another mammalian species, such as mouse,
may reveal associated markers even if the exact chromosomal locus is not known. This information is
valuable to investigators searching for disease genes using positional cloning or other gene discovery
techniques. Once the gene or genes responsible for a disease or syndrome have been crudely
localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23,
any sequences mapping to that area may represent associated or regulatory genes for further
investigation. (See, e.g., Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide sequence of
the instant invention may also be used to detect differences in the chromosomal location due to
translocation, inversion, etc., among normal, carrier, or affected individuals.

In another embodiment of the invention, TRICH, its catalytic or immunogenic fragments, or 
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug 
screening techniques. The fragment employed in such screening may be free in solution, affixed to a 
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes
between TRICH and the agent being tested may be measured.

Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT
application WO84/03564.) In this method, large numbers of different small test compounds are
synthesized on a solid substrate. The test compounds are reacted with TRICH, or fragments thereof,
and washed. Bound TRICH is then detected by methods well known in the art. Purified TRICH can
also be coated directly onto plates for use in the aforementioned drug screening techniques.
Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize it on a
solid support.

In another embodiment, one may use competitive drug screening assays in which neutralizing 
antibodies capable of binding TRICH specifically compete with a test compound for binding TRICH.






In this manner, antibodies can be used to detect the presence of any peptide which shares one or more
antigenic determinants with TRICH.

In additional embodiments, the nucleotide sequences which encode TRICH may be used in 
any molecular biology techniques that have yet to be developed, provided the new techniques rely on 
properties of nucleotide sequences that are currently known, including, but not limited to, such
properties as the triplet genetic code and specific base pair interactions. 

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are,
therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure
in any way whatsoever.

The disclosures of all patents, applications, and publications mentioned above and below, 
including U.S. Ser. No. 60/243,989, U.S. Ser. No. 60/245,904, U.S. Ser. No. 60/249,661, U.S. Ser. 
No. 60/247,673, U.S. Ser. No. 60/252,232, and U.S. Ser. No. 60/250,790, are hereby expressly 
incorporated by reference. 


EXAMPLES 

I. Construction of cDNA Libraries 

Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD database 
(Incyte Genomics, Palo Alto CA) and shown in Table 4, column 5. Some tissues were homogenized
and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a
suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic solution of 
phenol and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or 
extracted with chloroform. RNA was precipitated from the lysates with either isopropanol or sodium 
acetate and ethanol, or by other routine methods. 

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA
purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was isolated
using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN,
Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was
isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA
purification kit (Ambion, Austin TX).

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP
vector system (Stratagene) or SUPERSCRIPT plasmid system (Life Technologies), using the
recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra, units
5.1-6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic
oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the
appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-
1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column
chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs
were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g.,
PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid
(Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene), PCR2-TOPOTA plasmid (Invitrogen),
PCMV-ICIS plasmid (Stratagene), pIGEN (Incyte Genomics, Palo Alto CA), or pINCY (Incyte
Genomics), or derivatives thereof. Recombinant plasmids were transformed into competent E. coli
cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or
ElectroMAX DH10B from Life Technologies.
II. Isolation of cDNA Clones

Plasmids obtained as described in Example I were recovered from host cells by in vivo
excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using
at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an
AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid,
QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96
plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1
ml of distilled water and stored, with or without lyophilization, at 4°C.

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in
384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically
using PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence
scanner (Labsystems Oy, Helsinki, Finland).
III. Sequencing and Analysis

Incyte cDNAs recovered in plasmids as described in Example II were sequenced as follows.
Sequencing reactions were processed using standard methods or high-throughput instrumentation
such as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal
cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the
MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared
using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as
the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides
were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the
ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI
protocols and base calling software; or other sequence analysis systems known in the art. Reading
frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel,
1997, supra, unit 7.7). Some of the cDNA sequences were selected for extension using the techniques
disclosed in Example VIII.

The polynucleotide sequences derived from Incyte cDNAs were validated by removing
vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and
programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The
Incyte cDNA sequences or translations thereof were then queried against a selection of public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and
BLOCKS, PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo sapiens,
Rattus norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae,
Schizosaccharomyces pombe, and Candida albicans (Incyte Genomics, Palo Alto CA); and hidden
Markov model (HMM)-based protein family databases such as PFAM. (HMM is a probabilistic
approach which analyzes consensus primary structures of gene families. See, for example, Eddy,
S.R. (1996) Curr. Opin. Struct. Biol. 6:361-365.) The queries were performed using programs based
on BLAST, FASTA, BLIMPS, and HMMER. The Incyte cDNA sequences were assembled to
produce full length polynucleotide sequences. Alternatively, GenBank cDNAs, GenBank ESTs,
stitched sequences, stretched sequences, or Genscan-predicted coding sequences (see Examples IV
and V) were used to extend Incyte cDNA assemblages to full length. Assembly was performed using
programs based on Phred, Phrap, and Consed, and cDNA assemblages were screened for open
reading frames using programs based on GeneMark, BLAST, and FASTA. The full length
polynucleotide sequences were translated to derive the corresponding full length polypeptide
sequences. Alternatively, a polypeptide of the invention may begin at any of the methionine residues
of the full length translated polypeptide. Full length polypeptide sequences were subsequently
analyzed by querying against databases such as the GenBank protein databases (genpept), SwissProt,
the PROTEOME databases, BLOCKS, PRINTS, DOMO, PRODOM, Prosite, and hidden Markov
model (HMM)-based protein family databases such as PFAM. Full length polynucleotide sequences
are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering, South San
Francisco CA) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence
alignments are generated using default parameters specified by the CLUSTAL algorithm as
incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also
calculates the percent identity between aligned sequences.

Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of
Incyte cDNA and full length sequences and provides applicable descriptions, references, and
threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used,
the second column provides brief descriptions thereof, the third column presents appropriate
references, all of which are incorporated by reference herein in their entirety, and the fourth column
presents, where applicable, the scores, probability values, and other parameters used to evaluate the
strength of a match between two sequences (the higher the score or the lower the probability value,
the greater the identity between two sequences).

The programs described above for the assembly and analysis of full length polynucleotide
and polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ
ID NO:21-40. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization
and amplification technologies are described in Table 4, column 4.

IV. Identification and Editing of Coding Sequences from Genomic DNA 

Putative transporters and ion channels were initially identified by running the Genscan gene
identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is
a general-purpose gene identification program which analyzes genomic DNA sequences from a
variety of organisms. (See Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94, and Burge, C. and
S. Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354.) The program concatenates predicted exons to
form an assembled cDNA sequence extending from a methionine to a stop codon. The output of
Genscan is a FASTA database of polynucleotide and polypeptide sequences. The maximum range of
sequence for Genscan to analyze at once was set to 30 kb. To determine which of these Genscan
predicted cDNA sequences encode transporters and ion channels, the encoded polypeptides were
analyzed by querying against PFAM models for transporters and ion channels. Potential transporters
and ion channels were also identified by homology to Incyte cDNA sequences that had been
annotated as transporters and ion channels. These selected Genscan-predicted sequences were then
compared by BLAST analysis to the genpept and gbpri public databases. Where necessary, the
Genscan-predicted sequences were then edited by comparison to the top BLAST hit from genpept to
correct errors in the sequence predicted by Genscan, such as extra or omitted exons. BLAST analysis
was also used to find any Incyte cDNA or public cDNA coverage of the Genscan-predicted
sequences, thus providing evidence for transcription. When Incyte cDNA coverage was available,
this information was used to correct or confirm the Genscan predicted sequence. Full length
polynucleotide sequences were obtained by assembling Genscan-predicted coding sequences with
Incyte cDNA sequences and/or public cDNA sequences using the assembly process described in
Example III. Alternatively, full length polynucleotide sequences were derived entirely from edited or
unedited Genscan-predicted coding sequences.

V. Assembly of Genomic Sequence Data with cDNA Sequence Data
"Stitched" Sequences

Partial cDNA sequences were extended with exons predicted by the Genscan gene
identification program described in Example IV. Partial cDNAs assembled as described in Example
III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan
exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm
based on graph theory and dynamic programming to integrate cDNA and genomic information,
generating possible splice variants that were subsequently confirmed, edited, or extended to create a
full length sequence. Sequence intervals in which the entire length of the interval was present on
more than one sequence in the cluster were identified, and intervals thus identified were considered to
be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic
sequences, then all three intervals were considered to be equivalent. This process allows unrelated
but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals
thus identified were then "stitched" together by the stitching algorithm in the order that they appear
along their parent sequences to generate the longest possible sequence, as well as sequence variants.
Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or
genomic sequence to genomic sequence) were given preference over linkages which change parent
type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared
by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan
were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended
with additional cDNA sequences, or by inspection of genomic DNA, when necessary.
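
The step in which intervals are "considered to be equivalent by transitivity" may be illustrated with a disjoint-set (union-find) structure. The sketch below is an assumption made for illustration and covers only the grouping of equivalent intervals. It is not the stitching algorithm itself, which is described above only as being based on graph theory and dynamic programming. The interval labels are hypothetical.

```python
def group_equivalent(pairs):
    """Group interval labels into equivalence classes by transitivity.

    pairs: iterable of (a, b) tuples, each stating that interval a and
    interval b cover the same stretch of sequence.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    groups = {}
    for label in parent:
        groups.setdefault(find(label), set()).add(label)
    return sorted(sorted(g) for g in groups.values())


# One interval seen on a cDNA and on two genomic sequences, plus a second
# interval shared by another cDNA and one of those genomic sequences.
pairs = [
    ("cDNA_1:exonA", "genomic_1:exonA"),
    ("cDNA_1:exonA", "genomic_2:exonA"),
    ("cDNA_2:exonB", "genomic_2:exonB"),
]
classes = group_equivalent(pairs)
```

Here the first three labels fall into one class even though the two genomic intervals were never compared with each other directly, which mirrors the cDNA-bridging example given in the text.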

"Stretched" Sequences

Partial DNA sequences were extended to full length with an algorithm based on BLAST
analysis. First, partial cDNAs assembled as described in Example III were queried against public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases
using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST
analysis to either Incyte cDNA sequences or GenScan exon predicted sequences described in
Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs
(HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions
may occur in the chimeric protein with respect to the original GenBank protein homolog. The
GenBank protein homolog, the chimeric protein, or both were used as probes to search for
homologous genomic sequences from the public human genome databases. Partial DNA sequences
were therefore "stretched" or extended by the addition of homologous genomic sequences. The
resultant stretched sequences were examined to determine whether they contained a complete gene.
VI. Chromosomal Mapping of TRICH Encoding Polynucleotides

The sequences which were used to assemble SEQ ID NO:21-40 were compared with
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other
implementations of the Smith-Waterman algorithm. Sequences from these databases that matched
SEQ ID NO:21-40 were assembled into clusters of contiguous and overlapping sequences using
assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available
from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for
Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences
had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment
of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.

Map locations are represented by ranges, or intervals, of human chromosomes. The map
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in
humans, although this can vary widely due to hot and cold spots of recombination.) The cM
distances are based on genetic markers mapped by Généthon which provide boundaries for radiation
hybrid markers whose sequences were included in each of the clusters. Human genome maps and
other resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site
(http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified
disease genes map within or in proximity to the intervals indicated above.
VII. Analysis of Polynucleotide Expression 

Northern analysis is a laboratory technique used to detect the presence of a transcript of a
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs
from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel 
(1995) supra, ch. 4 and 16.) 

Analogous computer techniques applying BLAST were used to search for identical or related
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is
much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the
computer search can be modified to determine whether any particular match is categorized as exact or
similar. The basis of the search is the product score, which is defined as:

Product score = (BLAST Score x Percent Identity) / (5 x minimum {length(Seq. 1), length(Seq. 2)})

The product score takes into account both the degree of similarity between two sequences and the
length of the sequence match. The product score is a normalized value between 0 and 100, and is
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate
the product score. The product score represents a balance between fractional overlap and quality in a
BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the
entire length of the shorter of the two sequences being compared. A product score of 70 is produced
either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the
other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79%
identity and 100% overlap.
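
The product score calculation may be sketched as follows. The function names and the example sequence lengths (200 and 350 bases) are hypothetical, and the BLAST score is computed here for a single ungapped HSP using only the +5 and -4 values stated above.

```python
def blast_score(matches, mismatches):
    """Score one ungapped HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(score, percent_identity, len_seq1, len_seq2):
    """Product score as defined in the text, on a scale of 0 to 100."""
    return score * percent_identity / (5 * min(len_seq1, len_seq2))


# 100% identity over the entire 200-base shorter sequence
full = product_score(blast_score(200, 0), 100.0, 200, 350)

# 100% identity over 70% of the shorter sequence (140 of 200 bases)
partial = product_score(blast_score(140, 0), 100.0, 200, 350)

# 88% identity over the entire shorter sequence (176 matches, 24 mismatches)
mismatched = product_score(blast_score(176, 24), 88.0, 200, 350)
```

Under these assumptions the first two cases give 100 and 70, and the third gives about 69, in line with the approximate value of 70 stated above for 88% identity and 100% overlap.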

Alternatively, polynucleotide sequences encoding TRICH are analyzed with respect to the
tissue sources from which they were derived. For example, some full length sequences are
assembled, at least in part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA
sequence is derived from a cDNA library constructed from a human tissue. Each human tissue is
classified into one of the following organ/tissue categories: cardiovascular system; connective tissue;
digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female;
genitalia, male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous
system; pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed;
or urinary tract. The number of libraries in each category is counted and divided by the total number
of libraries across all categories. Similarly, each human tissue is classified into one of the following
disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma,
cardiovascular, pooled, and other, and the number of libraries in each category is counted and divided
by the total number of libraries across all categories. The resulting percentages reflect the tissue- and
disease-specific expression of cDNA encoding TRICH. cDNA sequences and cDNA library/tissue
information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto CA).
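
The counting step described above may be sketched as follows, assuming that each library contributing a cDNA sequence has already been assigned one category label. The function name and the example library list are illustrative only and do not reflect actual LIFESEQ GOLD data.

```python
from collections import Counter


def category_percentages(library_categories):
    """Percentage of libraries falling in each category.

    library_categories: one category label per cDNA library.
    """
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}  # no libraries, so no percentages to report
    return {category: 100.0 * n / total for category, n in counts.items()}


# Four hypothetical libraries classified by organ/tissue category
libraries = ["nervous system", "nervous system", "liver", "skin"]
percentages = category_percentages(libraries)
```

The same function applies unchanged to the disease/condition categories, since only the labels differ.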

VIII. Extension of TRICH Encoding Polynucleotides

Full length polynucleotide sequences were also produced by extension of an appropriate
fragment of the full length molecule using oligonucleotide primers designed from this fragment. One
primer was synthesized to initiate 5' extension of the known fragment, and the other primer was
synthesized to initiate 3' extension of the known fragment. The initial primers were designed using
OLIGO 4.06 software (National Biosciences), or another appropriate program, to be about 22 to 30
nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target
sequence at temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would
result in hairpin structures and primer-primer dimerizations was avoided.
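
Two of the primer criteria stated above, length and GC content, may be checked with a short sketch such as the following. It is not a substitute for OLIGO 4.06 or any other design program. The function names are hypothetical, the "about" ranges are treated as hard limits for simplicity, and annealing temperature, hairpins, and primer-primer dimerization are not evaluated.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def meets_basic_criteria(primer, min_len=22, max_len=30, min_gc=0.50):
    """Check only the length and GC-content criteria stated in the text."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Hypothetical 24-mer for illustration; not taken from any SEQ ID NO
candidate = "GCGATCGGCTAGCCTAGGCATGCA"
ok = meets_basic_criteria(candidate)

# A 14-mer fails the length criterion regardless of its GC content
too_short = meets_basic_criteria("GCGATCGGCTAGCC")
```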

Selected human cDNA libraries were used to extend the sequence. If more than one
extension was necessary or desired, additional or nested sets of primers were designed.

High fidelity amplification was obtained by PCR using methods well known in the art. PCR
was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction
mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4,
and 2-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme
(Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer
pair PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C,
2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the
alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94°C, 3 min; Step 2:
94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times;
Step 6: 68°C, 5 min; Step 7: storage at 4°C.
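
The first cycling program above may be written out as data to estimate its programmed duration. This is an illustrative sketch only. It assumes that "repeated 20 times" means Steps 2 to 4 are run once and then repeated 20 more times, and it ignores the ramp time between temperatures, neither of which is specified in the text.

```python
# (temperature in degrees C, hold time in seconds)
INITIAL_DENATURATION = (94, 180)            # Step 1
CYCLE = [(94, 15), (60, 60), (68, 120)]     # Steps 2, 3, and 4
REPEATS = 20                                # Step 5
FINAL_EXTENSION = (68, 300)                 # Step 6


def total_hold_seconds(initial, cycle, repeats, final):
    """Sum the programmed hold times, excluding the final 4 degree C storage."""
    per_cycle = sum(seconds for _, seconds in cycle)
    return initial[1] + per_cycle * (1 + repeats) + final[1]


seconds = total_hold_seconds(INITIAL_DENATURATION, CYCLE, REPEATS, FINAL_EXTENSION)
minutes = seconds / 60
```

Under these assumptions the programmed hold time is 4575 seconds, or a little over 76 minutes, before storage at 4°C.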

The concentration of DNA in each well was determined by dispensing 100 µl PICOGREEN
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar,
Acton MA), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by
electrophoresis on a 1% agarose gel to determine which reactions were successful in extending the
sequence.

The extended nucleotides were desalted and concentrated, transferred to 384-well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech). For
shotgun sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%)
agarose gels, fragments were excised, and agar digested with Agar ACE (Promega). Extended clones
were religated using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector (Amersham
Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site
overhangs, and transfected into competent E. coli cells. Transformed cells were selected on
antibiotic-containing media, and individual colonies were picked and cultured overnight at 37°C in
384-well plates in LB/2x carb liquid media.

The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase
(Amersham Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following
parameters: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min;
Step 5: steps 2, 3, and 4 repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA was
quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA
recoveries were reamplified using the same conditions as described above. Samples were diluted
with 20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing
primers and the DYENAMIC DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM
BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).

In like manner, full length polynucleotide sequences are verified using the above procedure or
are used to obtain 5' regulatory sequences using the above procedure along with oligonucleotides
designed for such extension, and an appropriate genomic library.

IX. Labeling and Use of Individual Hybridization Probes

Hybridization probes derived from SEQ ID NO:21-40 are employed to screen cDNAs,
genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base
pairs, is specifically described, essentially the same procedure is used with larger nucleotide
fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06
software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of
[γ-32P] adenosine triphosphate (Amersham Pharmacia Biotech), and T4 polynucleotide kinase
(DuPont NEN, Boston MA). The labeled oligonucleotides are substantially purified using a
SEPHADEX G-25 superfine size exclusion dextran bead column (Amersham Pharmacia Biotech).
An aliquot containing 10^7 counts per minute of the labeled probe is used in a typical membrane-based
hybridization analysis of human genomic DNA digested with one of the following endonucleases:
Ase I, Bgl II, Eco RI, Pst I, Xba I, or Pvu II (DuPont NEN).

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out for 16
hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room temperature
under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate.
Hybridization patterns are visualized using autoradiography or an alternative imaging means and
compared.

X. Microarrays 

The linkage or synthesis of array elements upon a microarray can be achieved utilizing
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler, supra),
mechanical microspotting technologies, and derivatives thereof. The substrate in each of the
aforementioned technologies should be uniform and solid with a non-porous surface (Schena (1999),
supra). Suggested substrates include silicon, silica, glass slides, glass chips, and silicon wafers.
Alternatively, a procedure analogous to a dot or slot blot may also be used to arrange and link
elements to the surface of a substrate using thermal, UV, chemical, or mechanical bonding
procedures. A typical array may be produced using available methods and machines well known to
those of ordinary skill in the art and may contain any appropriate number of elements. (See, e.g.,
Schena, M. et al. (1995) Science 270:467-470; Shalon, D. et al. (1996) Genome Res. 6:639-645;
Marshall, A. and J. Hodgson (1998) Nat. Biotechnol. 16:27-31.)

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may
comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be
selected using software well known in the art such as LASERGENE software (DNASTAR). The
array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the
biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection.
After hybridization, nonhybridized nucleotides from the biological sample are removed, and a
fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser
desorption and mass spectrometry may be used for detection of hybridization. The degree of
complementarity and the relative abundance of each polynucleotide which hybridizes to an element
on the microarray may be assessed. In one embodiment, microarray preparation and usage is
described in detail below.
Tissue or Cell Sample Preparation 

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-(dT) primer (21mer), 1X
first strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40
µM dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse
transcription reaction is performed in a 25 µl volume containing 200 ng poly(A)+ RNA with
GEMBRIGHT kits (Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro transcription
from non-coding yeast genomic DNA. After incubation at 37°C for 2 hr, each reaction sample (one
with Cy3 and another with Cy5 labeling) is treated with 2.5 µl of 0.5M sodium hydroxide and
incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA. Samples are purified
using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc.
(CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated
using 1 µl of glycogen (1 mg/ml), 60 µl sodium acetate, and 300 µl of 100% ethanol. The sample is
then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and
resuspended in 14 µl 5X SSC/0.2% SDS.
Microarray Preparation

Sequences of the present invention are used to generate array elements. Each array element
is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification
uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5
µg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia
Biotech).

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water,
and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a
110°C oven.

Array elements are applied to the coated glass substrate using a procedure described in U.S.
Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an average
concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-speed robotic
apparatus. The apparatus then deposits about 5 nl of array element sample per slide.

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene).
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in
0.2% SDS and distilled water as before.

Hybridization

Hybridization reactions contain 9 µl of sample mixture consisting of 0.2 µg each of Cy3 and
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered
with a 1.8 cm^2 coverslip. The arrays are transferred to a waterproof chamber having a cavity just
slightly larger than a microscope slide. The chamber is kept at 100% humidity internally by the
addition of 140 µl of 5X SSC in a corner of the chamber. The chamber containing the arrays is
incubated for about 6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash
buffer (1X SSC, 0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X
SSC), and dried.

Detection

Reporter-labeled hybridization complexes are detected with a microscope equipped with an
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide
containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-
scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a
resolution of 20 micrometers.
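The stated array size and scan resolution imply the size of the raster image. The short Python sketch below works this out; it assumes square pixels whose edge equals the stated resolution, and the function name pixels_per_side is hypothetical.

```python
def pixels_per_side(side_um: int, resolution_um: int) -> int:
    """Scan positions along one edge, rounding up to cover a partial pixel."""
    if side_um <= 0 or resolution_um <= 0:
        raise ValueError("dimensions must be positive")
    return -(-side_um // resolution_um)  # integer ceiling division


SIDE_UM = 18_000     # 1.8 cm array edge, from the text
RESOLUTION_UM = 20   # scan resolution, from the text

per_side = pixels_per_side(SIDE_UM, RESOLUTION_UM)
total_pixels = per_side * per_side
```

Under that assumption each scan yields a 900 x 900 image, or 810,000 intensity values per fluorophore.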

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially.
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477,
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source,
although the apparatus is capable of recording the spectra from both fluorophores simultaneously.
The sensitivity of the scans is typically calibrated using the signal intensity generated by a
cDNA control species added to the sample mixture at a known concentration. A specific location on
the array contains a complementary DNA sequence, allowing the intensity of the signal at that
location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples
from different sources (e.g., representing test and control cells), each labeled with a different
fluorophore, are hybridized to a single array for the purpose of identifying genes that are
differentially expressed, the calibration is done by labeling samples of the calibrating cDNA with the
two fluorophores and adding identical amounts of each to the hybridization mixture.

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC
computer. The digitized data are displayed as an image where the signal intensity is mapped using a
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping
emission spectra) between the fluorophores using each fluorophore's emission spectrum.
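The display mapping and crosstalk correction described above might be sketched as follows in Python. This is an illustration under stated assumptions, not the method actually used: the 12-bit counts are assumed to span 0 to 4095, the 20 colors are assumed to be equal-width bins, and crosstalk is modeled as a simple two-channel linear mixture whose leak fractions (leak_3_into_5, leak_5_into_3) are hypothetical parameters that would have to be derived from the measured emission spectra.

```python
def pseudocolor_bin(count: int, bits: int = 12, n_colors: int = 20) -> int:
    """Map an A/D count linearly to a color index (0 = blue ... n-1 = red)."""
    levels = 1 << bits                      # 4096 levels for a 12-bit converter
    if not 0 <= count < levels:
        raise ValueError("count outside converter range")
    return count * n_colors // levels


def correct_crosstalk(meas_cy3, meas_cy5, leak_3_into_5, leak_5_into_3):
    """Invert an assumed linear mixing model:
         meas_cy3 = true_cy3 + leak_5_into_3 * true_cy5
         meas_cy5 = leak_3_into_5 * true_cy3 + true_cy5
    """
    det = 1.0 - leak_3_into_5 * leak_5_into_3
    if det == 0:
        raise ValueError("channels are not separable")
    true_cy3 = (meas_cy3 - leak_5_into_3 * meas_cy5) / det
    true_cy5 = (meas_cy5 - leak_3_into_5 * meas_cy3) / det
    return true_cy3, true_cy5
```

With this binning the lowest count falls in color 0 and the highest 12-bit count (4095) falls in color 19.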

A grid is superimposed over the fluorescence signal image such that the signal from each
spot is centered in each element of the grid. The fluorescence signal within each element is then
integrated to obtain a numerical value corresponding to the average intensity of the signal. The
software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
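The grid integration step can be illustrated with a minimal Python sketch. It is not the GEMTOOLS implementation; it assumes a regular grid of square elements aligned to the image origin, an image held as a list of rows, and a hypothetical function name grid_average.

```python
def grid_average(image, cell: int):
    """Average pixel intensity inside each cell x cell grid element.

    image is a list of equal-length rows; trailing pixels that do not fill
    a whole grid element are ignored.
    """
    if cell <= 0:
        raise ValueError("cell size must be positive")
    n_rows = len(image) // cell
    n_cols = len(image[0]) // cell if image else 0
    averages = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            total = sum(
                image[y][x]
                for y in range(r * cell, (r + 1) * cell)
                for x in range(c * cell, (c + 1) * cell)
            )
            row.append(total / (cell * cell))
        averages.append(row)
    return averages
```

For example, a 4 x 4 image divided into 2 x 2 elements yields a 2 x 2 table of average intensities, one per spot.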

XI. Complementary Polynucleotides 

Sequences complementary to the TRICH-encoding sequences, or any parts thereof, are used
to detect, decrease, or inhibit expression of naturally occurring TRICH. Although use of
oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same
procedure is used with smaller or with larger sequence fragments. Appropriate oligonucleotides are
designed using OLIGO 4.06 software (National Biosciences) and the coding sequence of TRICH. To
inhibit transcription, a complementary oligonucleotide is designed from the most unique 5' sequence
and used to prevent promoter binding to the coding sequence. To inhibit translation, a
complementary oligonucleotide is designed to prevent ribosomal binding to the TRICH-encoding
transcript.
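The core operation in designing a complementary oligonucleotide is taking the reverse complement of a chosen window of the coding strand. The Python sketch below shows only that step; it is not a substitute for OLIGO 4.06, it does not assess uniqueness or melting temperature, and the function names and the example sequence in the usage note are hypothetical (the sequence is not taken from SEQ ID NO:21-40).

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    if set(seq.upper()) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int) -> str:
    """Oligonucleotide complementary to coding_strand[start:start + length],
    written 5' to 3'."""
    if start < 0 or length <= 0 or start + length > len(coding_strand):
        raise ValueError("window lies outside the sequence")
    return reverse_complement(coding_strand[start:start + length])
```

As a usage example, antisense_oligo("ATGGCCAAGTTCGAC", 0, 6) returns "GGCCAT", the complement of the first six bases read in the 5' to 3' direction.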

XII. Expression of TRICH

Expression and purification of TRICH is achieved using bacterial or virus-based expression
systems. For expression of TRICH in bacteria, cDNA is subcloned into an appropriate vector
containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA
transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid
promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory
element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3).
Antibiotic resistant bacteria express TRICH upon induction with isopropyl beta-D-
thiogalactopyranoside (IPTG). Expression of TRICH in eukaryotic cells is achieved by infecting
insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus
(AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is
replaced with cDNA encoding TRICH by either homologous recombination or bacterial-mediated
transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong
polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to
infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases.
Infection of the latter requires additional genetic modifications to baculovirus. (See Engelhard, E.K.
et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther.
7:1937-1945.)

In most expression systems, TRICH is synthesized as a fusion protein with, e.g., glutathione
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step,
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-
kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on
immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham
Pharmacia Biotech). Following purification, the GST moiety can be proteolytically cleaved from
TRICH at specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity
purification using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman
Kodak). 6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate
resins (QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995,
supra, ch. 10 and 16). Purified TRICH obtained by these methods can be used directly in the assays
shown in Examples XVI, XVII, and XVIII, where applicable.
XIII. Functional Assays

TRICH function is assessed by expressing the sequences encoding TRICH at physiologically
elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression
vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice
include PCMV SPORT (Life Technologies) and PCR3.1 (Invitrogen, Carlsbad CA), both of which
contain the cytomegalovirus promoter. 5-10 µg of recombinant vector are transiently transfected into
a human cell line, for example, an endothelial or hematopoietic cell line, using either liposome
formulations or electroporation. 1-2 µg of an additional plasmid containing sequences encoding a
marker protein are co-transfected. Expression of a marker protein provides a means to distinguish
transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the
recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP;
Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser optics-
based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate
the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of
fluorescent molecules that diagnose events preceding or coincident with cell death. These events
include changes in nuclear DNA content as measured by staining of DNA with propidium iodide;
changes in cell size and granularity as measured by forward light scatter and 90 degree side light
scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake;
alterations in expression of cell surface and intracellular proteins as measured by reactivity with
specific antibodies; and alterations in plasma membrane composition as measured by the binding of
fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are
discussed in Ormerod, M.G. (1994) Flow Cytometry, Oxford, New York NY.

The influence of TRICH on gene expression can be assessed using highly purified
populations of cells transfected with sequences encoding TRICH and either CD64 or CD64-GFP.
CD64 and CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions
of human immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected
cells using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake
Success NY). mRNA can be purified from the cells using methods well known by those of skill in
the art. Expression of mRNA encoding TRICH and other genes of interest can be analyzed by
northern analysis or microarray techniques.
XIV. Production of TRICH Specific Antibodies

TRICH substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g.,
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to
immunize rabbits and to produce antibodies using standard protocols.

Alternatively, the TRICH amino acid sequence is analyzed using LASERGENE software
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well
described in the art. (See, e.g., Ausubel, 1995, supra, ch. 11.)

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH (Sigma-
Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to
increase immunogenicity. (See, e.g., Ausubel, 1995, supra.) Rabbits are immunized with the
oligopeptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for
antipeptide and anti-TRICH activity by, for example, binding the peptide or TRICH to a substrate,
blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat
anti-rabbit IgG.

XV. Purification of Naturally Occurring TRICH Using Specific Antibodies

Naturally occurring or recombinant TRICH is substantially purified by immunoaffinity
chromatography using antibodies specific for TRICH. An immunoaffinity column is constructed by
covalently coupling anti-TRICH antibody to an activated chromatographic resin, such as
CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is
blocked and washed according to the manufacturer's instructions.

Media containing TRICH are passed over the immunoaffinity column, and the column is
washed under conditions that allow the preferential absorbance of TRICH (e.g., high ionic strength
buffers in the presence of detergent). The column is eluted under conditions that disrupt
antibody/TRICH binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such
as urea or thiocyanate ion), and TRICH is collected.

XVI. Identification of Molecules Which Interact with TRICH 

Molecules which interact with TRICH may include transporter substrates, agonists or
antagonists, modulatory proteins such as Gβγ proteins (Reimann, supra) or proteins involved in
TRICH localization or clustering such as MAGUKs (Craven, supra). TRICH, or biologically active
fragments thereof, are labeled with 125I Bolton-Hunter reagent. (See, e.g., Bolton, A.E. and W.M.
Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules previously arrayed in the wells of a
multi-well plate are incubated with the labeled TRICH, washed, and any wells with labeled TRICH
complex are assayed. Data obtained using different concentrations of TRICH are used to calculate
values for the number, affinity, and association of TRICH with the candidate molecules.

Alternatively, proteins that interact with TRICH are isolated using the yeast 2-hybrid system
(Fields, S. and O. Song (1989) Nature 340:245-246). TRICH, or fragments thereof, are expressed as
fusion proteins with the DNA binding domain of Gal4 or lexA, and potential interacting proteins are
expressed as fusion proteins with an activation domain. Interactions between the TRICH fusion
protein and the TRICH interacting proteins (fusion proteins with an activation domain) reconstitute a
transactivation function that is observed by expression of a reporter gene. Yeast 2-hybrid systems are
commercially available, and methods for use of the yeast 2-hybrid system with ion channel proteins
are discussed in Niethammer, M. and M. Sheng (1998, Meth. Enzymol. 293:104-122).

TRICH may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT)
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S.
Patent No. 6,057,101).

Potential TRICH agonists or antagonists may be tested for activation or inhibition of TRICH
ion channel activity using the assays described in section XVIII.

XVII. Demonstration of TRICH Activity

Ion channel activity of TRICH is demonstrated using an electrophysiological assay for ion
conductance. TRICH can be expressed by transforming a mammalian cell line such as COS7, HeLa
or CHO with a eukaryotic expression vector encoding TRICH. Eukaryotic expression vectors are
commercially available, and the techniques to introduce them into cells are well known to those
skilled in the art. A second plasmid which expresses any one of a number of marker genes, such as β-
galactosidase, is co-transformed into the cells to allow rapid identification of those cells which have
taken up and expressed the foreign DNA. The cells are incubated for 48-72 hours after
transformation under conditions appropriate for the cell line to allow expression and accumulation of
TRICH and β-galactosidase.

Transformed cells expressing β-galactosidase are stained blue when a suitable colorimetric
substrate is added to the culture media under conditions that are well known in the art. Stained cells
are tested for differences in membrane conductance by electrophysiological techniques that are well
known in the art. Untransformed cells, and/or cells transformed with either vector sequences alone or
β-galactosidase sequences alone, are used as controls and tested in parallel. Cells expressing TRICH
will have higher anion or cation conductance relative to control cells. The contribution of TRICH to
conductance can be confirmed by incubating the cells with antibodies specific for TRICH. The
antibodies will bind to the extracellular side of TRICH, thereby blocking the pore in the ion channel,
and the associated conductance.

Alternatively, ion channel activity of TRICH is measured as current flow across a TRICH-
containing Xenopus laevis oocyte membrane using the two-electrode voltage-clamp technique (Ishi et
al., supra; Jegla, T. and L. Salkoff (1997) J. Neurosci. 17:32-44). TRICH is subcloned into an
appropriate Xenopus oocyte expression vector, such as pBF, and 0.5-5 ng of mRNA is injected into
mature stage IV oocytes. Injected oocytes are incubated at 18°C for 1-5 days. Inside-out
macropatches are excised into an intracellular solution containing 116 mM K-gluconate, 4 mM KCl,
and 10 mM Hepes (pH 7.2). The intracellular solution is supplemented with varying concentrations
of the TRICH mediator, such as cAMP, cGMP, or Ca2+ (in the form of CaCl2), where appropriate.
Electrode resistance is set at 2-5 MΩ and electrodes are filled with the intracellular solution lacking
mediator. Experiments are performed at room temperature from a holding potential of 0 mV.
Voltage ramps (2.5 s) from -100 to 100 mV are acquired at a sampling frequency of 500 Hz. Current
measured is proportional to the activity of TRICH in the assay.

In particular, the activity of TRICH-2 is measured as voltage-gated Ca2+ or Na+ conductance,
the activity of TRICH-15 is measured as Ca2+ conductance, and the activity of TRICH-16 is measured
as K+ conductance.
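The voltage ramp parameters given above (2.5 s, -100 to 100 mV, 500 Hz sampling) determine the number of samples and the voltage increment per sample. The Python sketch below generates the command voltages as an illustration; it assumes a linear ramp sampled at the start of each interval, and the function name voltage_ramp is hypothetical.

```python
def voltage_ramp(v_start_mv: float, v_end_mv: float,
                 duration_s: float, rate_hz: float) -> list:
    """Command voltage (mV) at each sample time k / rate_hz of a linear ramp."""
    n_samples = round(duration_s * rate_hz)
    if n_samples <= 0:
        raise ValueError("ramp must contain at least one sample")
    span = v_end_mv - v_start_mv
    return [v_start_mv + span * k / n_samples for k in range(n_samples)]


# Parameters from the text: 2.5 s ramp, -100 to 100 mV, 500 Hz sampling.
ramp = voltage_ramp(-100.0, 100.0, 2.5, 500.0)
```

With these values the ramp contains 1250 samples spaced 0.16 mV apart, and the command voltage crosses 0 mV at sample 625.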

Transport activity of TRICH is assayed by measuring uptake of labeled substrates into
Xenopus laevis oocytes. Oocytes at stages V and VI are injected with TRICH mRNA (10 ng per
oocyte) and incubated for 3 days at 18°C in OR2 medium (82.5 mM NaCl, 2.5 mM KCl, 1 mM CaCl2,
1 mM MgCl2, 1 mM Na2HPO4, 5 mM Hepes, 3.8 mM NaOH, 50 µg/ml gentamycin, pH 7.8) to allow
expression of TRICH. Oocytes are then transferred to standard uptake medium (100 mM NaCl, 2 mM
KCl, 1 mM CaCl2, 1 mM MgCl2, 10 mM Hepes/Tris pH 7.5). Uptake of various substrates (e.g.,
amino acids, sugars, drugs, ions, and neurotransmitters) is initiated by adding labeled substrate (e.g.,
radiolabeled with 3H, fluorescently labeled with rhodamine, etc.) to the oocytes. After incubating for
30 minutes, uptake is terminated by washing the oocytes three times in Na+-free medium, measuring
the incorporated label, and comparing with controls. TRICH activity is proportional to the level of
internalized labeled substrate. In particular, test substrates include tricarboxylates for TRICH-1, H+
for TRICH-3, sulfate for TRICH-4, Na+ for TRICH-5, anionic metabolites for TRICH-6, glucose-6-
phosphate for TRICH-8, and amino acids for TRICH-10.

ATPase activity associated with TRICH can be measured by hydrolysis of radiolabeled ATP-
[γ-32P], separation of the hydrolysis products by chromatographic methods, and quantitation of the
recovered 32P using a scintillation counter. The reaction mixture contains ATP-[γ-32P] and varying
amounts of TRICH in a suitable buffer incubated at 37°C for a suitable period of time. The reaction
is terminated by acid precipitation with trichloroacetic acid and then neutralized with base, and an
aliquot of the reaction mixture is subjected to membrane or filter paper-based chromatography to
separate the reaction products. The amount of 32P liberated is counted in a scintillation counter. The
amount of radioactivity recovered is proportional to the ATPase activity of TRICH in the assay.

Lipocalin activity of TRICH is measured by ligand fluorescence enhancement
spectrofluorometry (Lin et al. (1997) Molecular Vision 3:17). Examples of ligands include retinol
(Sigma, St. Louis MO) and 16-anthryloxy-palmitic acid (16-AP) (Molecular Probes Inc., Eugene OR).
Ligand is dissolved in 100% ethanol and its concentration is estimated using known extinction
coefficients (retinol: 46,000 A/M/cm at 325 nm; 16-AP: 8,200 A/M/cm at 361 nm). A 700 µl aliquot
of 1 µM TRICH in 10 mM Tris (pH 7.5), 2 mM EDTA, and 500 mM NaCl is placed in a 1 cm path
length quartz cuvette and 1 µl aliquots of ligand solution are added. Fluorescence is measured 100
seconds after each addition until readings are stable. Change in fluorescence per unit change in
ligand concentration is proportional to TRICH activity.
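Estimating ligand concentration from an extinction coefficient follows the Beer-Lambert relation, c = A / (ε l). The Python sketch below applies it with the two coefficients quoted above; it assumes that "A/M/cm" denotes absorbance units per molar per centimeter, and the function name and the absorbance readings in the usage note are hypothetical.

```python
# Extinction coefficients quoted in the text (absorbance per M per cm).
EPSILON_PER_M_CM = {
    "retinol": 46_000.0,   # at 325 nm
    "16-AP": 8_200.0,      # at 361 nm
}


def ligand_concentration_molar(absorbance: float, ligand: str,
                               path_cm: float = 1.0) -> float:
    """Beer-Lambert estimate: c = A / (epsilon * path length)."""
    if absorbance < 0 or path_cm <= 0:
        raise ValueError("absorbance must be >= 0 and path length > 0")
    return absorbance / (EPSILON_PER_M_CM[ligand] * path_cm)
```

For instance, a retinol stock reading 0.46 absorbance units at 325 nm in a 1 cm cuvette would correspond to about 10 µM, and a 16-AP stock reading 0.82 at 361 nm to about 100 µM.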
XVIII. Identification of TRICH Agonists and Antagonists

TRICH is expressed in a eukaryotic cell line such as CHO (Chinese Hamster Ovary) or HEK
(Human Embryonic Kidney) 293. Ion channel activity of the transformed cells is measured in the
presence and absence of candidate agonists or antagonists. Ion channel activity is assayed using
patch clamp methods well known in the art or as described in Example XVII. Alternatively, ion
channel activity is assayed using fluorescent techniques that measure ion flux across the cell
membrane (Velicelebi, G. et al. (1999) Meth. Enzymol. 294:20-47; West, M.R. and C.R. Molloy
(1996) Anal. Biochem. 241:51-58). These assays may be adapted for high-throughput screening
using microplates. Changes in internal ion concentration are measured using fluorescent dyes such as
the Ca2+ indicator Fluo-4 AM, sodium-sensitive dyes such as SBFI and sodium green, or the Cl-
indicator MQAE (all available from Molecular Probes) in combination with the FLIPR fluorimetric
plate reading system (Molecular Devices). In a more generic version of this assay, changes in
membrane potential caused by ionic flux across the plasma membrane are measured using oxonol
dyes such as DiBAC4 (Molecular Probes). DiBAC4 equilibrates between the extracellular solution
and cellular sites according to the cellular membrane potential. The dye's fluorescence intensity is
20-fold greater when bound to hydrophobic intracellular sites, allowing detection of DiBAC4 entry
into the cell (Gonzalez, J.E. and P.A. Negulescu (1998) Curr. Opin. Biotechnol. 9:624-631).
Candidate agonists or antagonists may be selected from known ion channel agonists or antagonists,
peptide libraries, or combinatorial chemical libraries.

Various modifications and variations of the described methods and systems of the invention 
will be apparent to those skilled in the art without departing from the scope and spirit of the 
invention. Although the invention has been described in connection with certain embodiments, it 
should be understood that the invention as claimed should not be unduly limited to such specific 
embodiments. Indeed, various modifications of the described modes for carrying out the invention 
which are obvious to those skilled in molecular biology or related fields are intended to be within the 
scope of the following claims.

What is claimed is:

1. An isolated polypeptide selected from the group consisting of:

a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-20,

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-20,

c) a biologically active fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-20, and

d) an immunogenic fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-20.

2. An isolated polypeptide of claim 1 comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-20.

3. An isolated polynucleotide encoding a polypeptide of claim 1. 

4. An isolated polynucleotide encoding a polypeptide of claim 2.

5. An isolated polynucleotide of claim 4 comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO:21-40.

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide of claim 3.

7. A cell transformed with a recombinant polynucleotide of claim 6. 

8. A transgenic organism comprising a recombinant polynucleotide of claim 6.

9. A method of producing a polypeptide of claim 1, the method comprising:

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein
said cell is transformed with a recombinant polynucleotide, and said recombinant
polynucleotide comprises a promoter sequence operably linked to a polynucleotide
encoding the polypeptide of claim 1, and

b) recovering the polypeptide so expressed.

10. A method of claim 9, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO:1-20.

11. An isolated antibody which specifically binds to a polypeptide of claim 1. 

12. An isolated polynucleotide selected from the group consisting of: 

a) a polynucleotide comprising a polynucleotide sequence selected from the group 
consisting of SEQ ID NO:21-40, 

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
90% identical to a polynucleotide sequence selected from the group consisting of 
SEQ ID NO:21-40, 

c) a polynucleotide complementary to a polynucleotide of a), 

d) a polynucleotide complementary to a polynucleotide of b), and 

e) an RNA equivalent of a)-d). 

13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a 
polynucleotide of claim 12. 

14. A method of detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 12, the method comprising: 

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides 
comprising a sequence complementary to said target polynucleotide in the sample, 
and which probe specifically hybridizes to said target polynucleotide, under 
conditions whereby a hybridization complex is formed between said probe and said 
target polynucleotide or fragments thereof, and 

b) detecting the presence or absence of said hybridization complex, and, optionally, if 
present, the amount thereof. 

15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides. 

16. A method of detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 12, the method comprising: 

a) amplifying said target polynucleotide or fragment thereof using polymerase chain 
reaction amplification, and 

b) detecting the presence or absence of said amplified target polynucleotide or fragment 
thereof, and, optionally, if present, the amount thereof. 

17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable 
excipient. 

18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-20. 

19. A method for treating a disease or condition associated with decreased expression of 
functional TRICH, comprising administering to a patient in need of such treatment the composition of 
claim 17. 

20. A method of screening a compound for effectiveness as an agonist of a polypeptide of 
claim 1, the method comprising: 

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting agonist activity in the sample. 

21. A composition comprising an agonist compound identified by a method of claim 20 and a 
pharmaceutically acceptable excipient. 

22. A method for treating a disease or condition associated with decreased expression of 
functional TRICH, comprising administering to a patient in need of such treatment a composition of 
claim 21. 

23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of 
claim 1, the method comprising: 

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting antagonist activity in the sample. 

24. A composition comprising an antagonist compound identified by a method of claim 23 
and a pharmaceutically acceptable excipient. 

25. A method for treating a disease or condition associated with overexpression of functional 
TRICH, comprising administering to a patient in need of such treatment a composition of claim 24. 

26. A method of screening for a compound that specifically binds to the polypeptide of claim 
1, the method comprising: 

a) combining the polypeptide of claim 1 with at least one test compound under suitable 
conditions, and 

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby 
identifying a compound that specifically binds to the polypeptide of claim 1. 

27. A method of screening for a compound that modulates the activity of the polypeptide of 
claim 1, the method comprising: 

a) combining the polypeptide of claim 1 with at least one test compound under 
conditions permissive for the activity of the polypeptide of claim 1, 

b) assessing the activity of the polypeptide of claim 1 in the presence of the test 
compound, and 

c) comparing the activity of the polypeptide of claim 1 in the presence of the test 
compound with the activity of the polypeptide of claim 1 in the absence of the test 
compound, wherein a change in the activity of the polypeptide of claim 1 in the 
presence of the test compound is indicative of a compound that modulates the activity 
of the polypeptide of claim 1. 

28. A method of screening a compound for effectiveness in altering expression of a target 
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method 
comprising: 

a) exposing a sample comprising the target polynucleotide to a compound, under 
conditions suitable for the expression of the target polynucleotide, 

b) detecting altered expression of the target polynucleotide, and 

c) comparing the expression of the target polynucleotide in the presence of varying 
amounts of the compound and in the absence of the compound. 

29. A method of assessing toxicity of a test compound, the method comprising: 

a) treating a biological sample containing nucleic acids with the test compound, 

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising 
at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions 
whereby a specific hybridization complex is formed between said probe and a target 
polynucleotide in the biological sample, said target polynucleotide comprising a 
polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof, 

c) quantifying the amount of hybridization complex, and 

d) comparing the amount of hybridization complex in the treated biological sample with 
the amount of hybridization complex in an untreated biological sample, wherein a 
difference in the amount of hybridization complex in the treated biological sample is 
indicative of toxicity of the test compound. 

30. A diagnostic test for a condition or disease associated with the expression of TRICH in a 
biological sample, the method comprising: 

a) combining the biological sample with an antibody of claim 11, under conditions 
suitable for the antibody to bind the polypeptide and form an antibody:polypeptide 
complex, and 

b) detecting the complex, wherein the presence of the complex correlates with the 
presence of the polypeptide in the biological sample. 

31. The antibody of claim 11, wherein the antibody is: 

a) a chimeric antibody, 

b) a single chain antibody, 

c) a Fab fragment, 

d) a F(ab')2 fragment, or 

e) a humanized antibody. 



32. A composition comprising an antibody of claim 11 and an acceptable excipient. 

33. A method of diagnosing a condition or disease associated with the expression of TRICH 
in a subject, comprising administering to said subject an effective amount of the composition of claim 
32. 



34. A composition of claim 32, wherein the antibody is labeled. 

35. A method of diagnosing a condition or disease associated with the expression of TRICH 
in a subject, comprising administering to said subject an effective amount of the composition of claim 
34. 

36. A method of preparing a polyclonal antibody with the specificity of the antibody of claim 
11, the method comprising: 

a) immunizing an animal with a polypeptide consisting of an amino acid sequence 
selected from the group consisting of SEQ ID NO:1-20, or an immunogenic fragment 
thereof, under conditions to elicit an antibody response, 

b) isolating antibodies from said animal, and 

c) screening the isolated antibodies with the polypeptide, thereby identifying a 
polyclonal antibody which binds specifically to a polypeptide comprising an amino 
acid sequence selected from the group consisting of SEQ ID NO: 1-20. 

37. A polyclonal antibody produced by a method of claim 36. 

38. A composition comprising the polyclonal antibody of claim 37 and a suitable carrier. 

39. A method of making a monoclonal antibody with the specificity of the antibody of claim 
11, the method comprising: 

a) immunizing an animal with a polypeptide consisting of an amino acid sequence 
selected from the group consisting of SEQ ID NO:1-20, or an immunogenic fragment 
thereof, under conditions to elicit an antibody response, 

b) isolating antibody producing cells from the animal, 

c) fusing the antibody producing cells with immortalized cells to form monoclonal 
antibody-producing hybridoma cells, 

d) culturing the hybridoma cells, and 

e) isolating from the culture monoclonal antibody which binds specifically to a 
polypeptide comprising an amino acid sequence selected from the group consisting of 
SEQ ID NO:1-20. 

40. A monoclonal antibody produced by a method of claim 39. 

41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier. 

42. The antibody of claim 11, wherein the antibody is produced by screening a Fab 
expression library. 

43. The antibody of claim 11, wherein the antibody is produced by screening a recombinant 
immunoglobulin library. 

44. A method of detecting a polypeptide comprising an amino acid sequence selected from 
the group consisting of SEQ ID NO: 1-20 in a sample, the method comprising: 

a) incubating the antibody of claim 11 with a sample under conditions to allow specific 
binding of the antibody and the polypeptide, and 

b) detecting specific binding, wherein specific binding indicates the presence of a 
polypeptide comprising an amino acid sequence selected from the group consisting of 
SEQ ID NO:1-20 in the sample. 

45. A method of purifying a polypeptide comprising an amino acid sequence selected from 
the group consisting of SEQ ID NO:1-20 from a sample, the method comprising: 

a) incubating the antibody of claim 11 with a sample under conditions to allow specific 
binding of the antibody and the polypeptide, and 

b) separating the antibody from the sample and obtaining the purified polypeptide 
comprising an amino acid sequence selected from the group consisting of SEQ ID 
NO:1-20. 

46. A microarray wherein at least one element of the microarray is a polynucleotide of claim 
13. 



47. A method of generating an expression profile of a sample which contains 
polynucleotides, the method comprising: 

a) labeling the polynucleotides of the sample, 

b) contacting the elements of the microarray of claim 46 with the labeled 
polynucleotides of the sample under conditions suitable for the formation of a 
hybridization complex, and 

c) quantifying the expression of the polynucleotides in the sample. 

48. An array comprising different nucleotide molecules affixed in distinct physical locations 
on a solid substrate, wherein at least one of said nucleotide molecules comprises a first 
oligonucleotide or polynucleotide sequence specifically hybridizable with at least 30 contiguous 
nucleotides of a target polynucleotide, and wherein said target polynucleotide is a polynucleotide of 
claim 12. 

49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to at least 30 contiguous nucleotides of said target polynucleotide. 

50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to at least 60 contiguous nucleotides of said target polynucleotide. 

51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to said target polynucleotide. 

52. An array of claim 48, which is a microarray. 

53. An array of claim 48, further comprising said target polynucleotide hybridized to a 
nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence. 

54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to 
said solid substrate. 

55. An array of claim 48, wherein each distinct physical location on the substrate contains 
multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical 
location have the same sequence, and each distinct physical location on the substrate contains 
nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at 
another distinct physical location on the substrate. 



56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:1. 

57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2. 

58. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:3. 

59. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:4. 

60. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:5. 

61. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:6. 

62. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:7. 

63. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:8. 

64. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:9. 

65. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:10. 

66. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:11. 

67. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:12. 

68. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:13. 

69. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:14. 

70. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:15. 

71. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:16. 

72. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:17. 

73. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:18. 

74. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:19. 

75. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:20. 

76. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:21. 

77. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:22. 

78. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:23. 

79. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:24. 

80. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:25. 

81. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:26. 

82. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:27. 

83. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:28. 

84. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:29. 

85. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:30. 

86. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:31. 

87. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:32. 

88. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:33. 

89. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:34. 

90. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:35. 

91. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:36. 

92. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:37. 

93. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:38. 

94. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:39. 

95. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:40. 

<110> INCYTE GENOMICS, INC. 
TANG, Y. Tom 
YUE, Henry 
NGUYEN, Danniel B. 
HAFALIA, April J. A. 
ELLIOTT, Vicki S. 
LU, Yan 
WALIA, Narinder K. 
YAO, Monique G. 
BAUGHN, Marian R. 
GANDHI, Ameena R. 
DING, Li 
SANJANWALA, Madhusudan 
RAMKUMAR, Jayalaxmi 
ARVIZU, Chandra 
GIETZEN, Kimberly J. 
LAL, Preeti G. 
AZIMZAI, Yalda 
KHAN, Farrah A. 
THANGAVELU, Kavitha 
THORNTON, Michael 
LU, Dyung Aina M. 
TRIBOULEY, Catherine M. 
WARREN, Bridget A. 
ISON, H. Craig 
DAS, Debopriya 
RAUMANN, Brigette E. 
POLICKY, Jennifer L. 
KEARNEY, Liam 

<120> TRANSPORTERS AND ION CHANNELS 

<130> PI-0270 PCT 

<140> To Be Assigned 
<141> Herewith 

<150> 60/243,989; 60/245,904; 60/247,673; 60/249,661; 60/252,232 
60/250,790 

<151> 2000-10-27; 2000-11-03; 2000-11-09; 2000-11-17; 2000-11-20; 
2000-12-01 

<160> 40 

<170> PERL Program 

<210> 1 
<211> 337 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 1626101CD1 

<400> 1 

Met Ser Leu Glu Gln Glu Glu Glu Thr Gln Pro Gly Arg Leu Leu 
1 5 10 15 

Gly Arg Arg Asp Ala Val Pro Ala Phe Ile Glu Pro Asn Val Arg 
20 25 30 

Phe Trp Ile Thr Glu Arg Gln Ser Phe Ile Arg Arg Phe Leu Gln 
35 40 45 

Trp Thr Glu Leu Leu Asp Pro Thr Asn Val Phe Ile Ser Val Glu 
50 55 60 

Ser Ile Glu Asn Ser Arg Gln Leu Leu Cys Thr Asn Glu Asp Val 
65 70 75 

Ser Ser Pro Ala Ser Ala Asp Gln Arg Ile Gln Glu Ala Trp Lys 
80 85 90 

Arg Ser Leu Ala Thr Val His Pro Asp Ser Ser Asn Leu Ile Pro 
95 100 105 

Lys Leu Phe Arg Pro Ala Ala Phe Leu Pro Phe Met Ala Pro Thr 
110 115 120 

Val Phe Leu Ser Met Thr Pro Leu Lys Gly Ile Lys Ser Val Ile 
125 130 135 

Leu Pro Gln Val Phe Leu Cys Ala Tyr Met Ala Ala Phe Asn Ser 
140 145 150 

Ile Asn Gly Asn Arg Ser Tyr Thr Cys Lys Pro Leu Glu Arg Ser 
155 160 165 

Leu Leu Met Ala Gly Ala Val Ala Ser Ser Thr Phe Leu Gly Val 
170 175 180 

Ile Pro Gln Phe Val Gln Met Lys Tyr Gly Leu Thr Gly Pro Trp 
185 190 195 

Ile Lys Arg Leu Leu Pro Val Ile Phe Leu Val Gln Ala Ser Gly 
200 205 210 

Met Asn Val Tyr Met Ser Arg Ser Leu Glu Ser Ile Lys Gly Ile 
215 220 225 

Ala Val Met Asp Lys Glu Gly Asn Val Leu Gly His Ser Arg Ile 
230 235 240 

Ala Gly Thr Lys Ala Val Arg Glu Thr Leu Ala Ser Arg Ile Val 
245 250 255 

Leu Phe Gly Thr Ser Ala Leu Ile Pro Glu Val Phe Thr Tyr Phe 
260 265 270 

Phe Lys Arg Thr Gln Tyr Phe Arg Lys Asn Pro Gly Ser Leu Trp 
275 280 285 

Ile Leu Lys Leu Ser Cys Thr Val Leu Ala Met Gly Leu Met Val 
290 295 300 

Pro Phe Ser Phe Ser Ile Phe Pro Gln Ile Gly Gln Ile Gln Tyr 
305 310 315 

Cys Ser Leu Glu Glu Lys Ile Gln Ser Pro Thr Glu Glu Thr Glu 
320 325 330 

Ile Phe Tyr His Arg Gly Val 
335 

<210> 2 
<211> 816 
<212> PRT 
<213> Homo sapiens 
<220> 
<221> misc_feature 
<223> Incyte ID No: 2907828CD1 

<400> 2 

Met Ala Val Ser Leu Asp Asp Asp Val Pro Leu Ile Leu Thr Leu 
1 5 10 15 

Asp Glu Gly Gly Ser Ala Pro Leu Ala Pro Ser Asn Gly Leu Gly 
20 25 30 

Gln Glu Glu Leu Pro Ser Lys Asn Gly Gly Ser Tyr Ala Ile His 
35 40 45 

Asp Ser Gln Ala Pro Ser Leu Ser Ser Gly Gly Glu Ser Ser Pro 
50 55 60 

Ser Ser Pro Ala His Asn Trp Glu Met Asn Tyr Gln Glu Ala Ala 
65 70 75 

Ile Tyr Leu Gln Glu Gly Glu Asn Asn Asp Lys Phe Phe Thr His 
80 85 90 

Pro Lys Asp Ala Lys Ala Leu Ala Ala Tyr Leu Phe Ala His Asn 

Asn Phe Phe Lys His Val Pro Trp Ser Tyr Leu 
470 475 480 

Vj^I PVio IT 11c Leu Thr Ile xyx Gly Val Glu Leu Phe Leu Lys Val Ala 
485 490 495 

Gly Leu Gly Pro Val Glu Tyr Leu Ser Ser Gly Trp Asn Leu Phe 
500 505 510 

Asp Phe Ser Val Thr Val Phe Ala Phe Leu Gly Leu Leu Ala Leu 
515 520 525 

Ala Leu Asn Met Glu Pro Phe Tyr Phe Ile Val Val Leu Arg Pro 
530 535 540 

Leu Gln Leu Leu Arg Leu Phe Lys Leu Lys Glu Arg Tyr Arg Asn 
545 550 555 

Val Leu Asp Thr Met Phe Glu Leu Leu Pro Arg Met Ala Ser Leu 
560 565 570 

Gly Leu Thr Leu Leu Ile Phe Tyr Tyr Ser Phe Ala Ile Val Gly 
575 580 585 

Met Glu Phe Phe Cys Gly Ile Val Phe Pro Asn Cys Cys Asn Thr 
590 595 600 

Ser Thr Val Ala Asp Ala Tyr Arg Trp Arg Asn His Thr Val Gly 
605 610 615 

Asn Arg Thr Val Val Glu Glu Gly Tyr Tyr Tyr Leu Asn Asn Phe 
620 625 630 

Asp Asn Ile Leu Asn Ser Phe Val Thr Leu Phe Glu Leu Thr Val 
635 640 645 

Val Asn Asn Trp Tyr Ile Ile Met Glu Gly Val Thr Ser Gln Thr 
650 655 660 

Ser His Trp Ser Arg Leu Tyr Phe Met Thr Phe Tyr Ile Val Thr 
665 670 675 

Met Val Val Met Thr Ile Ile Val Ala Phe Ile Leu Glu Ala Phe 
680 685 690 

Val Phe Arg Met Asn Tyr Ser Arg Lys Asn Gln Asp Ser Glu Val 
695 700 705 

Asp Gly Gly Ile Thr Leu Glu Lys Glu Ile Ser Lys Glu Glu Leu 
710 715 720 

Val Ala Val Leu Glu Leu Tyr Arg Glu Ala Arg Gly Ala Ser Ser 
725 730 735 

Asp Val Thr Arg Leu Leu Glu Thr Leu Ser Gln Met Glu Arg Tyr 
740 745 750 

Gln Gln His Ser Met Val Phe Leu Gly Arg Arg Ser Arg Thr Lys 
755 760 765 

Ser Asp Leu Ser Leu Lys Met Tyr Gln Glu Glu Ile Gln Glu Trp 
770 775 780 

Tyr Glu Glu His Ala Arg Glu Gln Glu Gln Gln Arg Gln Leu Ser 
785 790 795 

Ser Ser Ala Ala Pro Ala Ala Gln Gln Pro Pro Gly Ser Arg Gln 
800 805 810 

Arg Ser Gln Thr Val Thr 
815 

<210> 3 
<211> 1047 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 3968527CD1 

<400> 3 

Met Thr Asp Asn Ile Pro Leu Gln Pro Val Arg Gln Lys Lys Arg 
1 5 10 15 

Met Asp Ser Arg Pro Arg Ala Gly Cys Cys Glu Trp Leu Arg Cys 
20 25 30 

Cys Gly Gly Gly Glu Ala Arg Pro Arg Thr Val Trp Leu Gly His 
35 40 45 

Pro Glu Lys Arg Asp Gln Arg Tyr Pro Arg Asn Val Ile Asn Asn 
50 55 60 

Gln Lys Tyr Asn Phe Phe Thr Phe Leu Pro Gly Val Leu Phe Asn 
65 70 75 

Gln Phe Lys Tyr Phe Phe Asn Leu Tyr Phe Leu Leu Leu Ala Cys 
80 85 90 

Ser Gln Phe Val Pro Glu Met Arg Leu Gly Ala Leu Tyr Thr Tyr 
95 100 105 

Trp Val Pro Leu Gly Phe Val Leu Ala Val Thr Val Ile Arg Glu 
110 115 120 

Ala Val Glu Glu Ile Arg Cys Tyr Val Arg Asp Lys Glu Val Asn 
125 130 135 

Ser Gln Val Tyr Ser Arg Leu Thr Ala Arg Gly Thr Val Lys Val 
140 145 150 

Lys Ser Ser Asn Ile Gln Val Gly Asp Leu Ile Ile Val Glu Lys 
155 160 165 

Asn Gln Arg Val Pro Ala Asp Met Ile Phe Leu Arg Thr Ser Glu 
170 175 180 

Lys Asn Gly Ser Cys Phe Leu Arg Thr Asp Gln Leu Asp Gly Glu 
185 190 195 

Thr Asp Trp Lys Leu Arg Leu Pro Val Ala Cys Thr Gln Arg Leu 
200 205 210 

Pro Thr Ala Ala Asp Leu Leu Gln Ile Arg Ser Tyr Val Tyr Ala 
215 220 225 

Glu Glu Pro Asn Ile Asp Ile His Asn Phe Val Gly Thr Phe Thr 
230 235 240 

Arg Glu Asp Ser Asp Pro Pro Ile Ser Glu Ser Leu Ser Ile Glu 
245 250 255 

Asn Thr Leu Trp Ala Gly Thr Val Val Ala Ser Gly Thr Val Val 
260 265 270 

Gly Val Val Leu Tyr Thr Gly Arg Glu Leu Arg Ser Val Met Asn 
275 280 285 

Thr Ser Asn Pro Arg Ser Lys Ile Gly Leu Phe Asp Leu Glu Val 
290 295 300 

Asn Cys Leu Thr Lys Ile Leu Phe Gly Ala Leu Val Val Val Ser 
305 310 315 

Leu Val Met Val Ala Leu Gln His Phe Ala Gly Arg Trp Tyr Leu 
320 325 330 

Gln Ile Ile Arg Phe Leu Leu Leu Phe Ser Asn Ile Ile Pro Ile 
335 340 345 

Ser Leu Arg Val Asn Leu Asp Met Gly Lys Ile Val Tyr Ser Trp 
350 355 360 

Val Ile Arg Arg Asp Ser Lys Ile Pro Gly Thr Val Val Arg Ser 
365 370 375 

Ser Thr Ile Pro Glu Gln Leu Gly Arg Ile Ser Tyr Leu Leu Thr 
380 385 390 

Asp Lys Thr Gly Thr Leu Thr Gln Asn Glu Met Ile Phe Lys Arg 
395 400 405 

Leu His Leu Gly Thr Val Ala Tyr Gly Leu Asp Ser Met Asp Glu 
410 415 420 

Val Gln Ser His Ile Phe Ser Ile Tyr Thr Gln Gln Ser Gln Asp 
425 430 435 

Pro Pro Ala Gln Lys Gly Pro Thr Leu Thr Thr Lys Val Arg Arg 
440 445 450 

Thr Met Ser Ser Arg Val His Glu Ala Val Lys Ala Ile Ala Leu 
455 460 465 

Cys His Asn Val Thr Pro Val Tyr Glu Ser Asn Gly Val Thr Asp 
470 475 480 

Gln Ala Glu Ala Glu Lys Gln Tyr Glu Asp Ser Cys Arg Val Tyr 
485 490 495 

Gln Ala Ser Ser Pro Asp Glu Val Ala Leu Val Gln Trp Thr Glu 
500 505 510 

Ser Val Gly Leu Thr Leu Val Gly Arg Asp Gln Ser Ser Met Gln 
515 520 525 

Leu Arg Thr Pro Gly Asp Gln Ile Leu Asn Phe Thr Ile Leu Gln 
530 535 540 

Ile Phe Pro Phe Thr Tyr Glu Ser Lys Arg Met Gly Ile Ile Val 
545 550 555 

Arg Asp Glu Ser Thr Gly Glu Ile Thr Phe Tyr Met Lys Gly Ala 
560 565 570 

Asp Val Val Met Ala Gly Ile Val Gln Tyr Asn Asp Trp Leu Glu 
575 580 585 

Glu Glu Cys Gly Asn Met Ala Arg Glu Gly Leu Arg Val Leu Val 
590 595 600 

Val Ala Lys Lys Ser Leu Ala Glu Glu Gln Tyr Gln Asp Phe Glu 
605 610 615 

Ala Arg Tyr Val Gln Ala Lys Leu Ser Val His Asp Arg Ser Leu 
620 625 630 

Lys Val Ala Thr Val Ile Glu Ser Leu Glu Met Glu Met Glu Leu 
635 640 645 

Leu Cys Leu Thr Gly Val Glu Asp Gln Leu Gln Ala Asp Val Arg 
650 655 660 

Pro Thr Leu Glu Thr Leu Arg Asn Ala Gly Ile Lys Val Trp Met 
665 670 675 

Leu Thr Gly Asp Lys Leu Glu Thr Ala Thr Cys Thr Ala Lys Asn 
680 685 690 

Ala His Leu Val Thr Arg Asn Gln Asp Ile His Val Phe Arg Leu 
695 700 705 

Val Thr Asn Arg Gly Glu Ala His Leu Glu Leu Asn Ala Phe Arg 
710 715 720 

Arg Lys His Asp Cys Ala Leu Val Ile Ser Gly Asp Ser Leu Glu 
725 730 735 

Val Cys Leu Lys Tyr Tyr Glu Tyr Glu Phe Met Glu Leu Ala Cys 
740 745 750 

Gln Cys Pro Ala Val Val Cys Cys Arg Cys Ala Pro Thr Gln Lys 
755 760 765 

Ala Gln Ile Val Arg Leu Leu Gln Glu Arg Thr Gly Lys Leu Thr 
770 775 780 

Cys Ala Val Gly Asp Gly Gly Asn Asp Val Ser Met Ile Gln Glu 
785 790 795 

Ser Asp Cys Gly Val Gly Val Glu Gly Lys Glu Gly Lys Gln Ala 
800 805 810 

Ser Leu Ala Ala Asp Phe Ser Ile Thr Gln Phe Lys His Leu Gly 
815 820 825 

Arg Leu Leu Met Val His Gly Arg Asn Ser Tyr Lys Arg Ser Ala 
830 835 840 

Ala Leu Ser Gln Phe Val Ile His Arg Ser Leu Cys Ile Ser Thr 
845 850 855 

Met Gln Ala Val Phe Ser Ser Val Phe Tyr Phe Ala Ser Val Pro 
860 865 870 

Leu Tyr Gln Gly Phe Leu Ile Ile Gly Tyr Ser Thr Ile Tyr Thr 
875 880 885 

Met Phe Pro Val Phe Ser Leu Val Leu Asp Lys Asp Val Lys Ser 
890 895 900 

Glu Val Ala Met Leu Tyr Pro Glu Leu Tyr Lys Asp Leu Leu Lys 
905 910 915 

Gly Arg Pro Leu Ser Tyr Lys Thr Phe Leu Ile Trp Val Leu Ile 
920 925 930 

Ser Ile Tyr Gln Gly Ser Thr Ile Met Tyr Gly Ala Leu Leu Leu 
935 940 945 

Phe Glu Ser Glu Phe Val His Ile Val Ala Ile Ser Phe Thr Ser 
950 955 960 

Leu Ile Leu Thr Glu Leu Leu Met Val Ala Leu Thr Ile Gln Thr 
965 970 975 

Trp His Trp Leu Met Thr Val Ala Glu Leu Leu Ser Leu Ala Cys 
980 985 990 

Tyr Ile Ala Ser Leu Val Phe Leu His Glu Phe Ile Asp Val Tyr 
995 1000 1005 

Phe Ile Ala Thr Leu Ser Phe Leu Trp Lys Val Ser Val Ile Thr 
1010 1015 1020 

Leu Val Ser Cys Leu Pro Leu Tyr Val Leu Lys Tyr Leu Arg Arg 
1025 1030 1035 

Arg Phe Ser Pro Pro Ser Tyr Ser Lys Leu Thr Ser 
1040 1045 

<210> 4 
<211> 671 
<212> PRT 
<213> Homo sapiens 
<220> 
<221> misc_feature 
<223> Incyte ID No: 7472732CD1 

<400> 4 

Met Thr Gly Ala Lys Arg Lys Lys Lys Ser Met Leu Trp Ser Lys 
1 5 10 15 

Met His Thr Pro Gln Cys Glu Asp Ile Ile Gln Trp Cys Arg Arg 
20 25 30 

Arg Leu Pro Ile Leu Asp Trp Ala Pro His Tyr Asn Leu Lys Glu 
35 40 45 

Asn Leu Leu Pro Asp Thr Val Ser Gly Ile Met Leu Ala Val Gln 
50 55 60 

Gln Val Thr Gln Gly Leu Ala Phe Ala Val Leu Ser Ser Val His 
65 70 75 

Pro Val Phe Gly Leu Tyr Gly Ser Leu Phe Pro Ala Ile Ile Tyr 
80 85 90 

Ala Ile Phe Gly Met Gly His His Val Ala Thr Gly Thr Phe Ala 
95 100 105 

Leu Thr Ser Leu Ile Ser Ala Asn Ala Val Glu Arg Ile Val Pro 
110 115 120 

Gln Asn Met Gln Asn Leu Thr Thr Gln Ser Asn Thr Ser Val Leu 
125 130 135 

Gly Leu Ser Asp Phe Glu Met Gln Arg Ile His Val Ala Ala Ala 
140 145 150 

Val Ser Phe Leu Gly Gly Val Ile Gln Val Ala Met Phe Val Leu 
155 160 165 

Gln Leu Gly Ser Ala Thr Phe Val Val Thr Glu Pro Val Ile Ser 
170 175 180 

Ala Met Thr Thr Gly Ala Ala Thr His Val Val Thr Ser Gln Val 
185 190 195 

Lys Tyr Leu Leu Gly Met Lys Met Pro Tyr Ile Ser Gly Pro Leu 
200 205 210 

Gly Phe Phe Tyr Ile Tyr Ala Tyr Val Phe Glu Asn Ile Lys Ser 
215 220 225 

Val Arg Leu Glu Ala Leu Leu Leu Ser Leu Leu Ser Ile Val Val 
230 235 240 

Leu Val Leu Val Lys Glu Leu Asn Glu Gln Phe Lys Arg Lys Ile 
245 250 255 

Lys Val Val Leu Pro Val Asp Leu Val Leu Ala Pro Asn Thr Ser 
260 265 270 

Pro Leu His His His Tyr Asp Cys Leu Phe Ala Asn Phe Leu Glu 
275 280 285 

Pro Pro Trp Glu Asp Gly Leu Pro Glu Gly Ala Phe Asn Gln Ala 
290 295 300 

Glu Gly His Leu Arg Arg Asn Ile Ile Pro Ser Pro Arg Ala Pro 
305 310 315 

Pro Met Asn Ile Leu Ser Ala Val Ile Thr Glu Ala Phe Gly Val 
320 325 330 

Ala Leu Val Gly Tyr Val Ala Ser Leu Ala Leu Ala Gln Gly Ser 
335 340 345 

Ala Lys Lys Phe Lys Tyr Ser Ile Asp Asp Asn Gln Glu Phe Leu 
350 355 360 

Ala His Gly Leu Ser Asn Ile Val Ser Ser Phe Phe Phe Cys Ile 
365 370 375 

Pro Ser Ala Ala Ala Met Gly Arg Thr Ala Gly Leu Tyr Ser Thr 
380 385 390 

Gly Ala Lys Thr Gln Val Ala Cys Leu Ile Ser Cys Ile Phe Val 
395 400 405 

Leu Ile Val Ile Tyr Ala Ile Gly Pro Leu Leu Tyr Trp Leu Pro 
410 415 420 

Met Cys Val Leu Ala Ser Ile Ile Val Val Gly Leu Lys Gly Met 
425 430 435 

Leu Ile Gln Phe Arg Asp Leu Lys Lys Tyr Trp Asn Val Asp Lys 
440 445 450 

Ile Asp Trp Gly Ile Trp Val Ser Thr Tyr Val Phe Thr Ile Cys 
455 460 465 

Phe Ala Ala Asn Val Gly Leu Leu Phe Gly Val Val Cys Thr Ile 
470 475 480 

Ala Ile Val Ile Gly Arg Phe Pro Arg Ala Met Thr Val Ser Ile 
485 490 495 

Lys Asn Met Lys Glu Met Glu Phe Lys Val Lys Thr Glu Met Asp 
500 505 510 

Ser Glu Thr Leu Gln Gln Val Lys Ile Ile Ser Ile Asn Asn Pro 
515 520 525 

Leu Val Phe Leu Asn Ala Lys Lys Phe Tyr Thr Asp Leu Met Asn 
530 535 540 

Met Ile Gln Lys Glu Asn Ala Cys Asn Gln Pro Leu Asp Asp Ile 
545 550 555 

Ser Lys Cys Glu Gln Asn Thr Leu Leu Asn Ser Leu Ser Asn Gly 
560 565 570 

Asn Cys Asn Glu Glu Ala Ser Gln Ser Cys Pro Asn Glu Lys Cys 
575 580 585 

Tyr Leu Ile Leu Asp Cys Ser Gly Phe Thr Phe Phe Asp Tyr Ser 
590 595 600 

Gly Val Ser Met Leu Val Glu Val Tyr Met Asp Cys Lys Gly Arg 
605 610 615 

Ser Val Asp Val Leu Leu Ala His Cys Thr Ala Ser Leu Ile Lys 
620 625 630 

Ala Met Thr Tyr Tyr Gly Asn Leu Asp Ser Glu Lys Pro Ile Phe 
635 640 645 

Phe Glu Ser Val Ser Ala Ala Ile Ser His Ile His Ser Asn Lys 
650 655 660 

Asn Leu Ser Lys Leu Ser Asp His Ser Glu Val 
665 670 

<210> 5 
<211> 671 
<212> PRT 
<213> Homo sapiens 
<220> 
<221> misc_feature 
<223> Incyte ID No: 7476938CD1 

<400> 5 

Met
Met


Glu 


Ala 


Gly 


Glu 


Ser Lys Gly Ile Val Leu


Ser Ser 


1 








5 






10 


15 


Gly Lys Gly Leu His Ala Ala Ser Phe Met Val Glu Gly Glu Asn

20 25 30

Val Arg Glu Gly Ile Gly Ser Glu Met Gly Thr Cys Pro Lys Trp

35 40 45

Thr Asn Val Ser His Cys Lys Met Gly Ile Met Pro Val Leu Val

50 55 60

Lys Gly Phe Val Leu Ser Gly Ser Arg Lys Gln Lys Arg Val Leu

65 70 75

Leu Ala Pro Arg Leu Arg Thr Arg Trp Ser Trp Lys Leu Arg Arg

80 85 90

Met Gly Glu Lys Met Ala Glu Glu Glu Arg Phe Pro Asn Thr Thr

95 100 105

His Glu Gly Phe Asn Val Thr Leu His Thr Thr Leu Val Val Thr

110 115 120

Thr Lys Leu Val Leu Pro Thr Pro Gly Lys Pro Ile Leu Pro Val

125 130 135

Gln Thr Gly Glu Gln Ala Gln Gln Glu Glu Gln Ser Ser Gly Met

140 145 150


Thar 


lie 


Ph Phe 


Ser 


Leu 


Leu 


Val 


Leu 


Ala He 


Cys lie 


He Leu 








155 










160 




165 


Val His Leu Leu Ile Arg Tyr Arg Leu His Phe Leu Pro Glu Ser

170 175 180

Val Ala Val Val Ser Leu Gly Ile Leu Met Gly Ala Val Ile Lys

185 190 195

Ile Ile Glu Phe Lys Lys Leu Ala Asn Trp Lys Glu Glu Glu Met

200 205 210


Phe 




Pro Asn 


Met 


Phe 


Phe 


Leu 


Leu 


Leu Leu 


Pro Pro 


He He 






215 










220 




225 




V3l u 






Ser 


Leu 


His 


Lys 


Gly Asn 


Phe Phe 


Gin Asn 






£i J VJ 










235 




240 


11c 


uiy 


oci 116 


J. IJ-L 


Leu 


Phe 


Ala 


Val 


Phe Gly 


Thr Ala 


He Ser 
















250 




255 


Ala 


rile 


val Val 


f2l vr 


Valjf 


Gly 


He 




Php Leu 


Gly Gin Ala Asp 








960 
^ o iy 










265 




270 


V a._L 


lie 


ocx xjy s 


T,pii 

JJCLl 


Asn 


Met 


Thr 


Asp 


Ser Phe 


Ala Phe Gly Ser 






97R 










280 




285 


lieu 


116 


OcX. nlQ 


Val 


Aero 


Pro 


Val 


Ala 


Thr He 


Ala He 


Phe Asn 








9Q0 










295 




300 


Til s» 
Ala 


T 

Leu 


Jtlla Val 


Asp 


JTJL \J 


Val' Leu 


Asn 


Met Leu 


Val Phe Gly Glu 


















310 




315 


ser 


lie 


T i A on 
juclx noil 




Ala 


Val 


Ser 


He 


Val Leu 


Thr Asn 


Thr Ala 


















325 




330 


ulll 


v>iy 


JjcU 1 IUL 






Asn 


Met 


Ser 


Asr> Val 


Ser Gly Trp Gin 
















340 




345 


lIlX 


trie 


licU WtIIX 


Al « 
Ala 


ucu 


Asp Tyr 


Phe 


Leu Lvs 


Met Phe 


Phe Gly 








J Jy 










355 




360 


oci 


zvl a 

Ala. 


Ala Leu 


nl -w 
uiy 


Thr 


Leu 


Thr 


Glv 


Leu lie 


Ser Ala 


Leu Val 


















370 




375 


Leu. 


Lys 


W-i e Tl o 
nib lie? 


Asp 


XJCU 


Arg 


Lys 


Thr 


Pro Ser 


Leu Glu 


Phe Gly 
















385 




390 


Met 


Diet 


Tl p Tie 
lie lie 


xrxic 


Ala 
ax a 


Tyr 


Leu 


Pro 


Tvr Glv 

XJfi, «XJf 


Leu Ala Glu Gly 


















400 




405 


lie 


Ot2X 


ucu Oci 


v?x_y 


He 


Met 


Ala 


He 


Leu Phe 


Ser Gly He Val 








ftiu 










415 




420 




Ser 


His Tyr 


Thr 


xl J. o 


His 


Asn 


Leu 


Ser Pro 


Val Thr 


Gin He 






A9S 










430 




435 


Leu 




Gin Gin 


J.IUL 


Leu 


Arg 


Thr 


Val 


Ala Phe 


Leu Cys 


Glu Thr 








440 










445 




450 


Cys Val Phe Ala Phe Leu Gly Leu Ser Ile Phe Ser Phe Pro His

455 460 465


T AT'C! 
XJJf O 


xriic 


Glu lie 


Ser 


Phe 


Val 


He 


Tro 


Cys He 


Val Leu 


Val Leu 






470 










475 




480 


Phe 


Civ 


At~ct AT a 


Val 


Asn 


He 


Phe 


Pro 


Leu Ser 


Tyr Leu 


Leu Asn 




485 










490 




495 


Phe 


Phe 


Arrcr Asd 


His 


Lys 


He 


Thr 


Pro 


Lys Met 


Met Phe 


He Met 






500 










505 




510 


Trp Phe Ser Gly Leu Arg Gly Ala Ile Pro Tyr Ala Leu Ser Leu

515 520 525


His Leu Asp Leu Glu Pro Met Glu Lys Arg Gln Leu Ile Gly Thr

530 535 540

Thr Thr Ile Val Ile Val Leu Phe Thr Ile Leu Leu Leu Gly Gly

545 550 555

Ser Thr Met Pro Leu Ile Arg Leu Met Asp Ile Glu Asp Ala Lys

560 565 570

Ala His Arg Arg Asn Lys Lys Asp Val Asn Leu Ser Lys Thr Glu

575 580 585

Lys Met Gly Asn Thr Val Glu Ser Glu His Leu Ser Glu Leu Thr

590 595 600

Glu Glu Glu Tyr Glu Ala His Tyr Ile Arg Arg Gln Asp Leu Lys

605 610 615

Gly Phe Val Trp Leu Asp Ala Lys Tyr Leu Asn Pro Phe Phe Thr 

620 625 630 

Arg Arg Leu Thr Gln Glu Asp Leu His His Gly Arg Ile Gln Met

635 640 645

Lys Thr Leu Thr Asn Lys Trp Tyr Glu Glu Val Arg Gln Gly Pro

650 655 660

Ser Gly Ser Glu Asp Asp Glu Gln Glu Leu Leu

665 670 

<210> 6 

<211> 315 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 8128531CD1 

<400> 6 

Met Thr His Gln Asp Leu Ser Ile Thr Ala Lys Leu Ile Asn Gly
1 5 10 15

Gly Val Ala Gly Leu Val Gly Val Thr Cys Val Phe Pro Ile Asp
20 25 30

Leu Ala Lys Thr Arg Leu Gln Asn Gln His Gly Lys Ala Met Tyr
35 40 45

Lys Gly Met Ile Asp Cys Leu Met Lys Thr Ala Arg Ala Glu Gly
50 55 60

Phe Phe Gly Met Tyr Arg Gly Ala Ala Val Asn Leu Thr Leu Val
65 70 75

Thr Pro Glu Lys Ala Ile Lys Leu Ala Ala Asn Asp Phe Phe Arg
80 85 90

Arg Leu Leu Met Glu Asp Gly Met Gln Arg Asn Leu Lys Met Glu
95 100 105 

Met Leu Ala Gly Cys Gly Ala Gly Met Cys Gln Val Val Val Thr

110 115 120

Cys Pro Met Glu Met Leu Lys Ile Gln Leu Gln Asp Ala Gly Arg

125 130 135

Leu Ala Val His His Gln Gly Ser Ala Ser Ala Pro Ser Thr Ser

140 145 150

Arg Ser Tyr Thr Thr Gly Ser Ala Ser Thr His Arg Arg Pro Ser

155 160 165

Ala Thr Leu Ile Ala Trp Glu Leu Leu Arg Thr Gln Gly Leu Ala

170 175 180

Gly Leu Tyr Arg Gly Leu Gly Ala Thr Leu Leu Arg Asp Ile Pro

185 190 195

Phe Ser Ile Ile Tyr Phe Pro Leu Phe Ala Asn Leu Asn Asn Leu

200 205 210

Gly Phe Asn Glu Leu Ala Gly Lys Ala Ser Phe Ala His Ser Phe

215 220 225

Val Ser Gly Cys Val Ala Gly Ser Ile Ala Ala Val Ala Val Thr

230 235 240

Pro Leu Asp Val Leu Lys Thr Arg Ile Gln Thr Leu Lys Lys Gly

245 250 255

Leu Gly Glu Asp Met Tyr Ser Gly Ile Thr Asp Cys Ala Arg Lys

260 265 270

Leu Trp Ile Gln Glu Gly Pro Ser Ala Phe Met Lys Gly Ala Gly

275 280 285

Cys Arg Ala Leu Val Ile Ala Pro Leu Phe Gly Ile Ala Gln Gly

290 295 300

Val Tyr Phe Ile Gly Ile Gly Glu Arg Ile Leu Lys Cys Phe Asp

305 310 315

<210> 7
<211> 445 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7476757CD1 

<400> 7 

Met Pro Trp Val Leu Gly Cys Thr Pro Phe Ile Ala Leu Ala Tyr
1 5 10 15

Phe Phe Leu Trp Phe Leu Pro Pro Phe Thr Ser Leu Arg Gly Leu
20 25 30

Trp Tyr Thr Thr Phe Tyr Cys Leu Phe Gln Ala Leu Ala Thr Phe
35 40 45

Phe Gln Val Pro Tyr Thr Ala Leu Thr Met Leu Leu Thr Pro Cys
50 55 60

Pro Arg Glu Arg Asp Ser Ala Thr Ala Ile Pro Asp Asp Cys Gly
65 70 75

Asp Gly Gly Asn Thr Asp Gly Gly His Cys Pro Arg Ala His Arg
80 85 90

Val Arg Arg Pro Gln Thr Pro Gln Val Arg Gly His Cys Asp Pro
95 100 105

Gly Ala Ser His Cys Leu Pro Glu Cys Ser His Leu Tyr Cys Ile

110 115 120

Ala Ala Ala Val Val Val Val Thr Tyr Pro Val Cys Ile Ser Leu

125 130 135 

Leu Cys Leu Gly Val Lys Glu Arg Pro Gly Phe Ala Phe Glu Leu 

140 145 150 

Cys Glu Ala Lys Val Thr Arg Phe Cys Val Ala Asp Pro Ser Ala

155 160 165 

Pro Ala Ser Gly Pro Gly Leu Ser Phe Leu Ala Gly Leu Ser Leu 

170 175 180 

Thr Thr Arg His Pro Pro Tyr Leu Lys Leu Val Ile Ser Phe Leu

185 190 195

Phe Ile Ser Ala Ala Val Gln Val Glu Gln Ser Tyr Leu Val Leu

200 205 210

Phe Cys Thr His Ala Ser Gln Leu His Asp His Val Gln Gly Leu

215 220 225

Val Ser Ala Val Leu Ser Thr Pro Leu Trp Glu Trp Val Leu Gln

230 235 240

Arg Phe Gly Lys Lys Thr Ser Ala Phe Gly Ile Phe Ala Met Val

245 250 255

Pro Phe Ala Ile Leu Leu Ala Ala Val Pro Thr Ala Pro Val Ala

260 265 270

Tyr Val Val Ala Phe Val Ser Gly Val Ser Ile Ala Val Ser Leu

275 280 285

Leu Leu Pro Trp Ser Met Leu Pro Asp Val Val Asp Asp Phe Gln

290 295 300

Leu Gln His Arg His Gly Pro Gly Leu Glu Thr Ile Phe Tyr Ser

305 310 315

Ser Tyr Val Phe Phe Thr Lys Leu Ser Gly Ala Cys Ala Leu Gly

320 325 330

Ile Ser Thr Leu Ser Leu Glu Phe Ser Gly Tyr Lys Ala Gly Val

335 340 345

Cys Lys Gln Ala Glu Glu Val Val Val Thr Leu Lys Val Leu Ile

350 355 360

Gly Ala Val Pro Thr Cys Met Ile Leu Ala Gly Leu Cys Ile Leu

365 370 375

Met Val Gly Ser Thr Pro Lys Thr Pro Ser Arg Asp Ala Ser Ser

380 385 390

Arg Leu Ser Leu Arg Arg Arg Ala Gln Ala Pro Asn Val His Thr

395

Ser Lys Val His Glu His Ala His 
410 

Gln Ala Val Gly Gly Leu Val Ile
425 

Thr Ala Ser Gly Ser Ala Ala Glu 
440 

<210> 8 
<211> 410 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 266243CD1 

<400> 8 



Met Ala Ala Ala Ala Val Gly Ala Gly His Gly Ala Gly Gly Pro

1 5 10 15

Gly Ala Ala Ser Ser Ser Gly Gly Ala Arg Glu Gly Ala Arg Val

20 25 30

Ala Ala Leu Cys Leu Leu Trp Tyr Ala Leu Ser Ala Gly Gly Asn

35 40 45

Val Val Asn Lys Val Ile Leu Ser Ala Phe Pro Phe Pro Val Thr

50 55 60


Val Ser Leu Cys His Ile Leu Ala Leu Cys Ala Gly Leu Pro Pro

65 70 75


Leu Leu Arg Ala Trp Arg Val Pro Pro Ala Pro Pro Val Ser Gly

80 85 90

Pro Gly Pro Ser Pro His Pro Ser Ser Gly Pro Leu Leu Pro Pro

95 100 105

Arg Phe Tyr Pro Arg Tyr Val Leu Pro Leu Ala Phe Gly Lys Tyr

110 115 120


Phe Ala Ser Val Ser Ala His Val Ser Ile Trp Lys Val Pro Val

125 130 135


Ser 


Tyr 


Ala 




inr 


V ct-L 


Lys 




rnVjr- M*=>t- Pro Tit* Tm Val Val 










140 








145 150 


Leu Leu Ser Arg Ile Ile Met Lys Glu Lys Gln Ser Thr Lys Val

155 160 165

Tyr Leu Ser Leu Ile Pro Ile Ile Ser Gly Val Leu Leu Ala Thr

170 175 180

Val Thr Glu Leu Ser Phe Asp Met Trp Gly Leu Val Ser Ala Leu

185 190 195

Ala Ala Thr Leu Cys Phe Ser Leu Gln Asn Ile Phe Ser Lys Lys

200 205 210

Val Leu Arg Asp Ser Arg Ile His His Leu Arg Leu Leu Asn Ile

215 220 225

Leu Gly Cys His Ala Val Phe Phe Met Ile Pro Thr Trp Val Leu

230 235 240

Val Asp Leu Ser Ala Phe Leu Val Ser Ser Asp Leu Thr Tyr Val

245 250 255

Tyr Gln Trp Pro Trp Thr Leu Leu Leu Leu Ala Val Ser Gly Phe

260 265 270

Cys Asn Phe Ala Gln Asn Val Ile Ala Phe Ser Ile Leu Asn Leu

275 280 285

Val Ser Pro Leu Ser Tyr Ser Val Ala Asn Ala Thr Lys Arg Ile

290 295 300

Met Val Ile Thr Val Ser Leu Ile Met Leu Arg Asn Pro Val Thr

305 310 315

Ser Thr Asn Val Leu Gly Met Met Thr Ala Ile Leu Gly Val Phe

320 325 330


Leu Tyr Asn Lys Thr Lys Tyr Asp Ala Asn Gln Gln Ala Arg Lys



400 405 

Ile Met Gln Ala His Ala Gly

415 420
Ser His Ser Leu Leu Arg Val

430 435
Arg Tyr

445

335 340 345

His Leu Leu Pro Val Thr Thr Ala Asp Leu Ser Ser Lys Glu Arg 

350 355 360 

His Arg Ser Pro Leu Glu Lys Pro His Asn Gly Leu Leu Phe Pro 

365 370 375 

Gln His Gly Asp Tyr Gln Tyr Gly Arg Asn Asn Ile Leu Thr Asp

380 385 390

His Phe Gln Tyr Ser Arg Gln Ser Tyr Pro Asn Ser Tyr Ser Leu

395 400 405 

Asn Arg Tyr Asp Val 

410 



<210> 9 
<211> 374 
<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 6585710CD1 



<400> 9 

Met Val His Tyr Phe Thr Ala Ile Gly Tyr Pro Cys Pro Arg Tyr
1 5 10 15

Ser Asn Pro Ala Asp Phe Tyr Val Asp Leu Thr Ser Ile Asp Arg
20 25 30

Arg Ser Arg Glu Gln Glu Leu Ala Thr Arg Glu Lys Ala Gln Ser
35 40 45

Leu Ala Ala Leu Phe Leu Glu Lys Val Arg Asp Leu Asp Asp Phe
50 55 60

Leu Trp Lys Ala Glu Thr Lys Asp Leu Asp Glu Asp Thr Cys Val
65 70 75

Glu Ser Ser Val Thr Pro Leu Asp Thr Asn Cys Leu Pro Ser Pro
80 85 90

Thr Lys Met Pro Gly Ala Val Gln Gln Phe Thr Thr Leu Ile Arg
95 100 105

Arg Gln Ile Ser Asn Asp Phe Arg Asp Leu Pro Thr Leu Leu Ile

110 115 120

His Gly Ala Glu Ala Cys Leu Met Ser Met Thr Ile Gly Phe Leu

125 130 135

Tyr Phe Gly His Gly Ser Ile Gln Leu Ser Phe Met Asp Thr Ala

140 145 150

Ala Leu Leu Phe Met Ile Gly Ala Leu Ile Pro Phe Asn Val Ile

155 160 165

Leu Asp Val Ile Ser Lys Cys Tyr Ser Glu Arg Ala Met Leu Tyr

170 175 180 

Tyr Glu Leu Glu Asp Gly Leu Tyr Thr Thr Gly Pro Tyr Phe Phe 

185 190 195

Ala Lys Ile Leu Gly Glu Leu Pro Glu His Cys Ala Tyr Ile Ile

200 205 210

Ile Tyr Gly Met Pro Thr Tyr Trp Leu Ala Asn Leu Arg Pro Gly

215 220 225

Leu Gln Pro Phe Leu Leu His Phe Leu Leu Val Trp Leu Val Val

230 235 240

Phe Cys Cys Arg Ile Met Ala Leu Ala Ala Ala Ala Leu Leu Pro

245 250 255

Thr Phe His Met Ala Ser Phe Phe Ser Asn Ala Leu Tyr Asn Ser

260 265 270

Phe Tyr Leu Ala Gly Gly Phe Met Ile Asn Leu Ser Ser Leu Trp

275 280 285

Thr Val Pro Ala Trp Ile Ser Lys Val Ser Phe Leu Arg Trp Cys

290 295 300

Phe Glu Gly Leu Met Lys Ile Gln Phe Ser Arg Arg Thr Tyr Lys

305 310 315


Met Pro Leu Gly Asn Leu Thr Ile Ala Val Ser Gly Asp Lys Ile

320 325 330

Leu Ser Ala Met Glu Leu Asp Ser Tyr Pro Leu Tyr Ala Ile Tyr

335 340 345

Leu Ile Val Ile Gly Leu Ser Gly Gly Phe Met Val Leu Tyr Tyr

350 355 360

Val Ser Leu Arg Phe Ile Lys Gln Lys Pro Ser Gln Asp Trp

365 370







<210> 10 

<211> 443 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7483599CD1 

<400> 10 



Met Asp Lys Phe Leu Asp Thr Tyr Asn Leu Pro Arg Leu Asn Gln

1 5 10 15

Glu Glu Ile Gln Asn Leu Lys Arg Pro Ile Thr Ser Asn Glu Ile

20 25 30

Lys Ala Ile Ile Lys Ser Leu Gln Met Ser Leu Leu Gly Arg Asp

35 40 45

Tyr Asn Ser Glu Leu Asn Ser Leu Asp Asn Gly Pro Gln Ser Pro

50 55 60

Ser Glu Ser Ser Ser Ser Ile Thr Ser Glu Asn Val His Pro Ala

65 70 75

Gly Glu Ala Gly Leu Ser Met Met Gln Thr Leu Ile His Leu Leu

80 85 90

Lys Cys Asn Ile Gly Thr Gly Leu Leu Gly Leu Pro Leu Ala Ile

95 100 105

Lys Asn Ala Gly Leu Leu Val Gly Pro Val Ser Leu Leu Ala Ile

110 115 120

Gly Val Leu Thr Val His Cys Met Val Ile Leu Leu Asn Cys Ala

125 130 135

Gln His Leu Ser Gln Pro Arg Leu Gln Lys Thr Phe Val Asn Tyr

140 145 150


Gly Glu Ala Thr Met Tyr Gly Leu Glu Thr Cys Pro Asn Thr Trp

155 160 165

Leu Arg Ala His Ala Val Trp Gly Arg Tyr Thr Val Ser Phe Leu

170 175 180

Leu Val Ile Thr Gln Leu Gly Phe Cys Ser Val Tyr Phe Met Phe

185 190 195

Met Ala Asp Asn Leu Gln Gln Met Val Glu Lys Ala His Val Thr

200 205 210

Ser Asn Ile Cys Gln Pro Arg Glu Ile Leu Thr Leu Thr Pro Ile

215 220 225

Leu Asp Ile Arg Phe Tyr Met Leu Ile Ile Leu Pro Phe Leu Ile

230 235 240

Leu Leu Val Phe Ile Gln Asn Leu Lys Val Leu Ser Val Phe Ser

245 250 255

Thr Leu Ala Asn Ile Thr Thr Leu Gly Ser Met Ala Leu Ile Phe

260 265 270

Glu Tyr Ile Met Glu Gly Ile Pro Tyr Pro Ser Asn Leu Pro Leu

275 280 285

Met Ala Asn Trp Lys Thr Phe Leu Leu Phe Phe Gly Thr Ala Ile

290 295 300

Phe Thr Phe Glu Gly Val Gly Met Val Leu Pro Leu Lys Asn Gln

305 310 315

Met Lys His Pro Gln Gln Phe Ser Phe Val Leu Tyr Leu Gly Met

320 325 330


Ser Ile Val Ile Ile Leu Tyr Ile Leu Leu Gly Thr Leu Gly Tyr

335 340 345

Met Lys Phe Gly Ser Asp Thr Gln Ala Ser Ile Thr Leu Asn Leu

350 355 360

Pro Asn Cys Trp Tyr Val Leu Pro Thr Ser Gly Glu Ile Gly Arg

365 370 375

Asp Thr Gly Thr Val Leu Val Val Ile Ala Glu Ser Thr Ala Lys

380 385 390

Leu Ser His Glu Ala Gly Asn Pro Ser Leu Glu Val Thr Tyr Val

395 400 405

Ser Pro Ala His Thr Ala Ser Val Lys Ala Ser His Met Ala Ala

410 415 420

Pro His Ser Lys Gly Ala Gly Lys Cys Asn Ser Ala Met Cys Leu

425 430 435

Glu Val Phe Gly Glu Gln His Lys

440

<210> 11 

<211> 321 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 2507246CD1 

<400> 11 



Met Ala Thr Gly Gly Gln Gln Lys Glu Asn Thr Leu Leu His Leu

1 5 10 15

Phe Ala Gly Gly Cys Gly Gly Thr Val Gly Ala Ile Phe Thr Cys

20 25 30

Pro Leu Glu Val Ile Lys Thr Arg Leu Gln Ser Ser Arg Leu Ala

35 40 45

Leu Arg Thr Val Tyr Tyr Pro Gln Val His Leu Gly Thr Ile Ser

50 55 60

Gly Ala Gly Met Val Arg Pro Thr Ser Val Thr Pro Gly Leu Phe

65 70 75

Gln Val Leu Lys Ser Ile Leu Glu Lys Glu Gly Pro Lys Ser Leu

80 85 90

Phe Arg Gly Leu Gly Pro Asn Leu Val Gly Val Ala Pro Ser Arg

95 100 105

Ala Val Tyr Phe Ala Cys Tyr Ser Lys Ala Lys Glu Gln Phe Asn

110 115 120

Gly Ile Phe Val Pro Asn Ser Asn Ile Val His Ile Phe Ser Ala

125 130 135


Gly Ser Ala Ala Phe Ile Thr Asn Ser Leu Met Asn Pro Ile Trp

140 145 150

Met Val Lys Thr Arg Met Gln Leu Glu Gln Lys Val Arg Gly Ser

155 160 165

Lys Gln Met Asn Thr Leu Gln Cys Ala Arg Tyr Val Tyr Gln Thr

170 175 180

Glu Gly Ile Arg Gly Phe Tyr Arg Gly Leu Thr Ala Ser Tyr Ala

185 190 195

Gly Ile Ser Glu Thr Ile Ile Cys Phe Ala Ile Tyr Glu Ser Leu

200 205 210

Lys Lys Tyr Leu Lys Glu Ala Pro Leu Ala Ser Ser Ala Asn Gly

215 220 225

Thr Glu Lys Asn Ser Thr Ser Phe Phe Gly Leu Met Ala Ala Ala

230 235 240

Ala Leu Ser Lys Gly Cys Ala Ser Cys Ile Ala Tyr Pro His Glu

245 250 255

Val Ile Arg Thr Arg Leu Arg Glu Glu Gly Thr Lys Tyr Lys Ser

260

Phe Val Gln Thr Ala
275

Ala Phe Tyr Arg Gly
290

Asn Thr Ala Ile Val
305 

Leu Glu Asp Arg Thr 
320 

<210> 12 
<211> 487 
<212> PRT 
<213> Homo sapiens 

<220> 

<221> misc_feature 
<223> Incyte ID No: 3033505CD1 

<400> 12 



wee 


wet 


HIS 


£*ne 


Lys 


Ser 


Vj-Ly 


Leu Glu 


Leu 


inr xieu vjj-ii 


JtXQll 


X 








D 








10 




1*5 


Met 


Tnr 


vax 


Pro 


r»i ii 


Asp 


Asp 


Asn He 


Ser 


~Ty e-tn Ion Coy "7\ C2T> 

ASD, ASp Oci noil 


Asp 










on 








25 






irlie 


inr 




vax 


l^XU 


Asn 


biy 


Gin He 


Asn 


oci XiyS Jrilc lie 


Cor 

OCX 


















40 






Asp 


Arg 




Ser 


Arg 


Arg 


Ser 


Leu Thr 


Asn 


Del X1.L£> JJCU UlU 


xiy 0 










so 








55 




60 


Lys 


Lys 




Asp 


V7.LU 


Tyr 


-L -L t2 


Pro Gly Thr 


' ' ' ' i_r~ 1 ■ JJCLi. VJO. Jf 


Met 










65 








70 




75 


Ser Val Phe Asn Leu Ser Asn Ala Ile Met Gly Ser Gly Ile Leu

80 85 90


Gly Leu Ala Phe Ala Leu Ala Asn Thr Gly Ile Leu Leu Phe Leu

95 100 105

Val Leu Leu Thr Ser Val Thr Leu Leu Ser Ile Tyr Ser Ile Asn

110 115 120

Leu Leu Leu Ile Cys Ser Lys Glu Thr Gly Cys Met Val Tyr Glu

125 130 135

Lys Leu Gly Glu Gln Val Phe Gly Thr Thr Gly Lys Phe Val Ile

140 145 150

Phe Gly Ala Thr Ser Leu Gln Asn Thr Gly Ala Met Leu Ser Tyr

155 160 165

Leu Phe Ile Val Lys Asn Glu Leu Pro Ser Ala Ile Lys Phe Leu

170 175 180

Met Gly Lys Glu Glu Thr Phe Ser Ala Trp Tyr Val Asp Gly Arg

185 190 195

Val Leu Val Val Ile Val Thr Phe Gly Ile Ile Leu Pro Leu Cys

200 205 210


Leu Leu Lys Asn Leu Gly Tyr Leu Gly Tyr Thr Ser Gly Phe Ser

215 220 225

Leu Ser Cys Met Val Phe Phe Leu Ile Val Val Ile Tyr Lys Lys

230 235 240

Phe Gln Ile Pro Cys Ile Val Pro Glu Leu Asn Ser Thr Ile Ser

245 250 255

Ala Asn Ser Thr Asn Ala Asp Thr Cys Thr Pro Lys Tyr Val Thr

260 265 270

Phe Asn Ser Lys Thr Val Tyr Ala Leu Pro Thr Ile Ala Phe Ala

275 280 285

Phe Val Cys His Pro Ser Val Leu Pro Ile Tyr Ser Glu Leu Lys

290 295 300

Asp Arg Ser Gln Lys Lys Met Gln Met Val Ser Asn Ile Ser Phe

305 310 315


Phe Ala Met Phe Val Met Tyr Phe Leu Thr Ala Ile Phe Gly Tyr



265 270 
Arg Leu Val Phe Arg Glu Glu Gly Tyr Leu 
280 285 
Leu Phe Ala Gln Leu Ile Arg Gln Ile Pro
295 300
Leu Ser Thr Tyr Glu Leu Ile Val Tyr Leu
310 315

Gln

320



Leu 


Thr 


Phe 


Tyr Asp 
335 


Asn 


Val 


Gin 


Gin 


Ser 


Lys 


Asp Asp 
350 


He 


Leu 


He 


He 


Val 


Ala 


Val He 
365 


Leu 


Thr 


Val 


Arg 


Ser 


Ser 


Leu Phe 
380 


Glu 


Leu 


Ala 


Cys 


Arg 


His 


Thr Val 
395 


Val 


Thr 


Cys 


Leu 


Leu 


Val 


lie Phe 
410 


He 


Pro 


Ser 


Val 


Gly 


Val 


Thr Ser 
425 


Ala 


Asn 


Met 


Ser 


Leu 


Tyr 


Leu Lys 
440 


He 


Thr 


Asp 


Gin 


Arg 


He 


Trp Ala 
455 


Ala 


Leu 


Phe 


Ser 


Leu 


Val 


Ser He 
470 


Pro 


Leu 


Val 


Ser 


Ser 


Ser 


Asp Glu 


Gly 


His 





485

<210> 13

<211> 509

<212> PRT

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 4027693CD1 

<400> 13 



Met 


Glu 


Leu 


Lys 


Lys 


Ser 


Pro 


Asp 


1 








5 








Val 


Phe 


Val 


Ser 


Phe 
20 


Leu 


Thr 


Gin 


Leu 


Ala 


Val 


Gly 


Val 
35 


Leu 


Tyr 


He 


Glu 


Gly 


Lys 


Gly 


Lys 
50 


Thr 


Ala 


Trp 


Val 


Gly 


Leu 


Leu 


Ala 
65 


Ser 


Pro 


Val 


Phe 


Gly 


Ala 


Arg 


Pro 
80 


Val 


Thr 


He 


Gly 


Gly 


Leu 


Met 


Leu 
95 


Ser 


Ser 


Phe 


Phe 


Phe 


Ser 


Tyr 


Gly 
110 


He 


Val 


Val 


Tyr 


Thr 


Ala 


Thr 


Val 
125 


Thr 


He 


Thr 


Arg 


Gly 


Leu 


Ala 


Leu 
140 


Gly 


Leu 


He 


Leu 


Phe 


He 


Tyr 


Ala 
155 


Ala 


Leu 


Gin 


Gly 


Leu 


Asp 


Gly 


Cys 
170 


Leu 


Leu 


He 


He 


Leu 


Ala 


Cys 


Gly 
185 


Ser 


Leu 


Met 


Cys 


Pro 


Leu 


Pro 


Lys 
200 


Lys 


He 


Ala 


Tyr 


Ser 


He 


Tyr 


Asn 


Glu 


Lys 


Gly 





325 










330 


Ser 


Asp 


Leu 


Leu 


His 


Lys 


Tyr 




340 










345 


Leu 


Thr 


Val 


Arg 


Leu 


Ala 


Val 




355 










360 


Pro 


Val 


Leu 


Phe 


Phe 


Thr 


Val 




370 










375 


Lys 


Lys 


Thr 


Lys 


Phe 


Asn 


Leu 




385 










390 


He 


Leu 


Leu 


Val 


Val 


He 


Asn 




400 










405 


Met 


Lys 


Asp 


He 


Phe 


Gly 


Val 




415 










420 


Leu 


He 


Phe 


He 


Leu 


Pro 


Ser 




430 










435 


Gin 


Asp 


Gly 


Asp 


Lys 


Gly 


Thr 




445 










450 


Leu 


Gly 


Leu 


Gly 


Val 


Leu 


Phe 




460 










465 


He 


Tyr 


Asp 


Trp 


Ala 


Cys 


Ser 




475 










480 


Glv 


Glv 


Tra 

XT 


Glv 


TrD 


Val 


He 




10 










15 


Phe 


Leu 


Cys 




Glv 


Ser 


Pro 




25 










30 


Glu 


Tra 


Leu 


Asp 


Ala 


Phe 


Glv 




40
Val


Gly
Ala
Ala
Asn
Arg
Asp
Asp
Lys


Asn 


Leu 


Glu 


Glu 


Asn 


Ile
215






Asn 


He 


Leu 


Asd Lvs 
230 


Ser 


Tyr Ser Ser 


Thr 


Leu 


Ala 


Asn Gly 
245 


Asp


Trp Lys Gln
Thr


Val Thr
260
Val


Ala 


Glu Gin 
275 
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D>io 
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nla 
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i T ie l. tiiy xjbll 


■file 
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425 
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neu 
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Ala Vjiy 
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fill \r len CQy 
\jl_y Abxi Oex. 
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Trp 




iyr asp 
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Trp 


1 iix bin ±111 
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Phe Cys 
470 


Val
Trp


Asp Thr Cys
Ala


Pro 


Thr Thr 
500
Gly


Asn 


Ala 


Glu Met
Leu


lie 


Thr 


Ala 


Glu 


Glu Gln
Ala Val


Gly 


Ser 


He 


Glu 


lie 


Phe Arg 








35 








He Thr 


Leu 


Met 


He 


Leu 


Gly 


Ile Leu
Ala Cys
Pro


Leu 


Met 


Pro 


Leu Cys
Asn Leu


He 


Ser 


Gly 


Cys 


Leu 


Val His 








80 








Gin Asn 


Cys 


Thr 


Gin 


Ser 


Gin 


Glu Lys
Glu


Glu 


Lys 


Cys 


Arg 


He 


235 










240 


Asp 


Ser 


Leu 


Leu 


His 


Lys 


250 










255 


Pro 


Glu 


Thr 


Tyr 


Lys 


Lys 


265 










270 


LVS 


Gin 


Leu 


Ala 


Lvs 


Arq 


280 










285 


Glv 


Glu 


Thr 


Val 


Ala 


Leu 


295 










300 


Phe 


He 


Ala 


He 


Leu 


Leu 


310 










315 


Leu 


Leu 


Met 


Glu 


Asp 


Val 


325 










330 


Glu 


Phe 


He 


Met 


Pro 


lieu 


140 










345 


v ax 


Glv 


Lys 


Leu 


Leu 


Leu 


355 










360 


Asn 


Thr 


Leu 




Leu 




■j / \j 










375 


Al a 
ax a 




Cys 


Ala 


He 


Pro 


IRS 












Leu 


Leu 


Ser 


Glv 


He 


Leu 


400 










405 


lr lie 


riu 


iyi 


Val 


Thr 


Thr 


415 










420 


His 


Ala 


Thrr 

Aj/l 


Glv 


He 


Leu 


410 










435 


Ti£*11 


Glv 


Pro 


Pro 


He 


Val 

vai 


445 










450 


Tyr 


Asp 


He 


Ala 


Phe 


Tyr 


460 










465 


Gly 


Phe 


He 


Leu 


Leu 


Leu 


475 










480 


Asn 


Lys 


Gin 


Leu 


Pro 


Lys 


490 










495 


Val 


Ala 


Ser 


Asn 


Val 




505 




- 








Phe 


Asn 


He 


Gin 


Lys 


Ser 


10 










15 


Pro 


Lvs 


Leu 


Arcr 


Lvs 


Glu 


25 










30 


Phe 


Ala 


Asp 


Gly 


Leu 


Asp 


40 










45 


Thr 


Ser 


Leu 


Phe 


Asn 


Gly 


55 










60 


He 


Gly 


Glu 


Met 


Ser 


Asp
75


Thr 


Asn 


Thr 


Thr 


Asn 


Tyr 


85 










90 


Leu 


Asn 


Glu 


Asp 


Met 


Thr

95 100 105



Leu Leu Thr Leu Tyr Tyr Val Gly Ile Gly Val Ala Ala Leu Ile
110 115 120

Phe Gly Tyr Ile Gln Ile Ser Leu Trp Ile Ile Thr Ala Ala Arg
125 130 135

Gln Thr Lys Arg Ile Arg Lys Gln Phe Phe His Ser Val Leu Ala
140 145 150

Gln Asp Ile Gly Trp Phe Asp Ser Cys Asp Ile Gly Glu Leu Asn
155 160 165

Thr Arg Met Thr Asp Asp Ile Asp Lys Ile Ser Asp Gly Ile Gly
170 175 180

Asp Lys Ile Ala Leu Leu Phe Gln Asn Met Ser Thr Phe Ser Ile
185 190 195

Gly Leu Ala Val Gly Leu Val Lys Gly Trp Lys Leu Thr Leu Val
200 205 210

Thr Leu Ser Thr Ser Pro Leu Ile Met Ala Ser Ala Ala Ala Cys
215 220 225

Ser Arg Met Val Ile Ser Leu Thr Ser Lys Glu Leu Ser Ala Tyr
230 235 240

Ser Lys Ala Gly Ala Val Ala Glu Glu Val Leu Ser Ser Ile Arg
245 250 255

Thr Val Ile Ala Phe Arg Ala Gln Glu Lys Glu Leu Gln Arg Tyr
260 265 270

Thr Gln Asn Leu Lys Asp Ala Lys Asp Phe Gly Ile Lys Arg Thr
275 280 285

Ile Ala Ser Lys Val Ser Leu Gly Ala Val Tyr Phe Phe Met Asn
290 295 300

Gly Thr Tyr Gly Leu Ala Phe Trp Tyr Gly Thr Ser Leu Ile Leu
305 310 315

Asn Gly Glu Pro Gly Tyr Thr Ile Gly Thr Val Leu Ala Val Phe
320 325 330

Phe Ser Val Ile His Ser Ser Tyr Cys Ile Gly Ala Ala Val Pro
335 340 345 

His Phe Glu Thr Phe Ala He Ala Arg Gly Ala Ala Phe His He 
350 355 360 

Phe Gin Val He Asp Lys Lys Pro Ser He Gly Asn Phe Ser Thr 
365 370 375 

Ala Gly Tyr Lys Pro Glu Ser He Glu Gly Thr Val Glu Phe Lys 
380 385 390 

Asn Val Ser Phe Asn Tyr Pro Ser Arg Pro Ser He Lys He Leu 
395 400 405 

Lys Gly Leu Asn Leu Gly He Lys Ser Gly Glu Thr Val Ala Leu 
410 415 420 

Val Gly Leu Asn Gly Ser Gly Lys Ser Thr Val Val Gin Leu Leu 
425 430 435 

Gin Arg Leu Tyr Asp Pro Asp Asp Gly Phe He Met Val Asp Glu 
440 445 450 

Asn Asp He Arg Ala Leu Asn Val Arg His Tyr Arg Asp His He 
455 460 465 

Gly Val Val Ser Gin Glu Pro Val Leu Phe Gly Thr Thr He Ser 
470 . 475 480 

Asn Asn He Lys Tyr Gly Arg Asp Asp Val Thr Asp Glu Glu Met 
485 490 495 

Glu Arg Ala Ala Arg Glu Ala Asn Ala Tyr Asp Phe He Met Glu 
500 505 510 

Phe Pro Asn Lys Phe Asn Thr Leu Val Gly Glu Lys Gly Ala Gin 
515 520 525 

Met Ser Gly Gly Gin Lys Gin Arg He Ala He Ala Arg Ala Leu 
530 535 540 

Val Arg Asn Pro Lys He Leu He Leu Asp Glu Ala Thr Ser Ala 
545 550 555 

Leu Asp Ser Glu Ser Lys Ser Ala Val Gin Ala Ala Leu Glu Lys 



560 



565 



570 
Ala


Ser 


Lys 


Gly 


Arg 
575 


Thr 


Thr 


Ile


Val 


Val Ala 
580 


His 


Arg Leu 


Ser 
585 


Thr 


Ile


Arg 


Ser 


Ala 
590 


Asp 


Leu 


Ile


Val 


Thr Leu 
595 


Lys 


Asp Gly 


Met 
600 


Leu 


Ala 


Glu 


Lys 


Gly 
605 


Ala 


His 


Ala 


Glu 


Leu Met 
610 


Ala 


Lys Arg 


Gly 
615 


Leu 


Tyr 


Tyr 


Ser 


Leu 
620 


Val 


Met 


Ser 


Gln


Asp Ile
625 


Lys 


Lys Ala 


Asp 
630 


Glu 


Gin 


Met 


Glu 


Ser 
635 


Met 


Thr 


Tyr 


Ser 


Thr Glu 
640 


Arg 


Lys Thr 


Asn 
645 


Ser 


Leu 


Pro 


Leu 


His 
650 


Ser 


Val 


Lys 


Ser 


Ile Lys
655 


Ser 


Asp Phe 


He 
660 


Asp


Lys


Ala 


Glu 


Glu 
665 


Ser 


Thr 


Gin 


Ser 


Lys Glu 
670 


He 


Ser Leu 


Pro 
675 


Glu 


Val 


Ser 


Leu 


Leu 
680 


Lys 


He 


Leu 


Lys 


Leu Asn 
685 


Lys 


Pro Glu 


Trp
690 


Pro 


Phe 


Val 


Val 


Leu 
695 


Gly


Thr 


Leu 


Ala 


Ser Val 
700 


Leu 


Asn Gly 


Thr 
705 


Val 


His 


Pro 


Val 


Phe 
710 


Ser 


He 


He 


Phe 


Ala Lys 
715 


He 


Ile Thr


Met 
720 


Phe 


Glv 


Asn 


Asn 


Asp 
725 


Lys 


Thr 


Thr 


Leu 


Lys His
730 


Asp 


Ala Glu 


He 
735 


Tyr


Ser 


Met 


He 


Phe 
740 


Val 


He 


Leu 


Glv 


Val Ile
745 


Cys 


Phe Val 


Ser 
750 


1\rr 


Phe 


Met 


Gin 


Asp 
755 


He 


Ala 




Phe 


Asp Glu 
760 


Lys 


Glu Asn 


Ser 
765 


Thr 


Glv 


Glv 


Leu 


Thr 
770 


Thr 


He 


Leu 


Ala 


Ile Asp
775 


He 


Ala Gln


He 
780 


Gin 


Glv 


Ala 


Thr 


Glv 
785 


Ser 


Arg 


He 


Glv 


Val Leu 
790 


Thr 


Gln Asn


Ala 
795 


Thr 


Asn 


Met 


Glv 


Leu 
800 


Ser 


Val 


He 


He 


Ser Phe 
805 


He 


Tyr Gly


Trp
810 


Glu 


Met 


Thr 


Phe 


Leu 
815 


He 


Leu 


Ser 


He 


Ala Pro 
820 


Val 


Leu Ala 


Val 
825 


Thr 


Glv 


Met 


He 


Glu 
830 


Thr 


Ala 


Ala 


Met 


Thr Gly 
835 


Phe 


Ala Asn 


Lys 
840 


Asp 


Lys 


Gin 


Glu 


Leu 
845 


Lys 


His 


Ala 


Gly 


Lys Ile
850 


Ala 


Thr Glu 


Ala 
855 


Leu 


Glu 


Asn 


He 


A "ITT 


Thr 


He 


Val 


Ser 


lieu Thr 


Arg Glu Lys


Ala 










860 










865 






870 


Phe 


Glu 


Gin 


Met 


Tyr
875 


Glu 


Glu 


Met 


Leu 


Gln Thr
880 


Gin 


His Arg 


Asn 
885 


Thr 


Ser 


Lys 


Lys 


Ala 
890 


Gin 


He 


He 


Gly


Ser Cys 
895 


Tyr 


Ala Phe 


Ser 
900 


His 


Ala 


Phe 


He 


Tvr 
905 


Phe 


Ala 


Tvr 


Ala 


Ala Gly 
910 


Phe 


Arg Phe 


Gly 
915 


Ala 


Tyr 


Leu 


He 


Gin 
920 


Ala 


Gly 


Arg 


Met 


Thr Pro 
925 


Glu 


Gly Met 


Phe 
930 


Ile


Val 


Phe 


Thr 


Ala 
935 


He 


Ala 


Tyr 


Gly 


Ala Met 
940 


Ala 


Ile Gly


Glu 
Leu


Val 


Leu 


Ala 
950 


Pro 


Glu 
Ser


Lys Ala 
Ala
Phe


Ala 
Leu


Glu 
Lys Pro
Ser
Arg
Glu


Gly 
980 
Lys
Asp


Thr Cys 
Glu


Gly Asn 


Leu 
990 
Phe


Arg 


Glu 


Val 


Ser 


Phe 
Tyr


Pro Cys 


Arg 
Val
Phe


He 
Arg


Gly 
Ser
Ser


Ile Glu


Arg 


Gly Lys Thr 








1010 








1015 




1020 


Val 
Ser


Gly 
Ser
Val Gln


Gly 


Gln Val
1040




1045 


1050 


Phe 


Asp 


Gly 


Val Asp 


Ala Lys Glu 


Leu Asn 


Val Gln Trp Leu Arg








1055 




1060 


1065 


Ser 


Gin 


He 


Ala Ile


Val Pro Gln


Glu Pro 


Val Leu Phe Asn Cys 








1070 




1075 


1080 


Ser 


He 


Ala 


Glu Asn 


Ile Ala Tyr


Gly Asp 


Asn Ser Arg Val Val 








1085 




1090 


1095 


Pro 


Leu 


Asp 


Glu Ile


Lys Glu Ala 


Ala Asn 


Ala Ala Asn Ile His








1100 




1105 


1110 


Ser 


Phe 


He 


Glu Gly 


Leu Pro Glu 


Lys Tyr 


Asn Thr Gln Val Gly








1115 




1120 


1125 


Leu 


Lys 


Gly


Ala Gln


Leu Ser Gly 


Gly Gln


Lvs m n At~ci Tien Ala 








1130 




1135 


1140 


He 


Ala 


Arg 


Ala Leu 


Leu Gln Lys


Pro Lys 


Tlf* TiMi TiPii L^u Af3T> 

JUJL^ JJCU XJCU JJCU no^/ 
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T iPH A C3T> A QTl 
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filn T.vcj Val Val filn 

V7JL U XJjf O VCLX VOX V3JL AJL 








1160 




1165 


1170 


His 


Ala 


Leu 
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Bl a T±~r~rt fVir" 
/1LC1 ^TuL y X XXX. 




T^Viv rSro T.on T7a 1 Vial 

xi jjl Lyb jucu vdx vai 
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llOJ 


Thr 


His 


Arg 


Leu Ser 


Ala Ile Gln


Asn Ala 


Asp Leu Ile Val Val








1190 
1200


Leu 
Asn


Gly Lys 


Ile Lys Glu


Gln Gly


Thr His Gln Glu Leu
Arg
Lys Leu


Val Asn Ala Gln Ser
Val


Gln
185










190 










195 


Glu 


Leu 


Asp


Lys 


Ala 


Phe 


Ser 


Val 


Ser 


Val 


Leu 


Ser 


Val 


Ser 


Ser 
205










210 


Gly 


Ser 


Leu 


Gly 


Ala 


His 


He 


Asn 


Ala 


Thr 


Leu 


Thr 


Val 


Leu 


Ala 








215 










220 










225 


Ser 


Asp 


Asp 


Pro 


Tyr


Gly 


He 


Phe 


He 


Phe 


Ser 


Glu 


Lys 


Asn 


Arg 








230 










235 










240 


Pro 


Val 


Lys


Val 


Glu 


Glu 


Ala 


Thr 


Gin 


Asn 


He 


Thr 


Leu 


Ser 


He 








245 










250 










255 


Ile


Arg 


Leu 


Lys 


Gly
260 


Leu 


Met 


Gly 


Lys


Val 
265 


Leu 


Val 


Ser 


Tyr 


Ala 
270 


Thr 


Leu 


Asp 


Asp 


Met 


Glu 


Lys 


Pro 


Pro 


Tyr


Phe 


Pro 


Pro 


Asn 


Leu 








275 










280 










285 


Ala 


At"cr 


Ala 


Thr 


Gin 
290 


Gly


Arg 


Asp 


Tyr


He 
295 


Pro 


Ala 


Ser 


Gly


Phe 
300 


Ala 
nia 


JUCU 


JTAAC 


Gly


Ala 
305 


Asn 


Gin 


Ser 


Glu 


Ala 
310 


Thr 


He 


Ala 


He 


Ser 
315 


Ile








Asp 
320 


Glu 


Pro 


Glu 


Ar ci 


Ser 
325 


Glu 


Ser 


Val 


Phe 


He 
330 


Gln


Leu 


Leu 


A on 
noli 


OCl 

335 


Thr 


Leu 


Val 


Ala 


Lys 
340 


Val 


Gin 


Ser 


Arg 


Ser 
345 
Ile


XT J. \J 


A on 
noli 
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ir JL u 

350 


niy 
Lys
355 
Thr


Ile


Ala 
Ile
Asp


Ala 
370 


Phe 


Glv 
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O CI 
380
Ala
Ile


noli 


Val 


Thr 
395 
Gly


Gly


Ala 
400 


Phe 


Ala 


Asp 


Val 
405


Val 


Lys 


xr lit; 




Ala 
410 


Val 
Ile


Thr 


Ala 
415 


He 


Ala 
Glu
450
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Cot* 

Del 


Phe 
455 
Val


Gin 


Leu 


Met 
460 
Glu


Thr 


Thr 


Gly
465 


Gly 


Ala 




Leu 


Gly 
470 
Thr
475


Val 


He 


He 


He 


Glu 
480 


AJLa 




Asp 


Asp 


JrlU 

485 
Gly


Leu 


Phe 


Glv 
oiy 
Ile


Thr 


Lys 
Leu


Ile 


Val 


Glu 


Glu 
500 
505


Val 


Lys 


Val 
Leu
510 


Pro 


He 


He 


A T"f*T 
Thr


Leu 
Thr
Ala


Thr 
Gln


Leu 


Ala 
535 


Thr 
Leu


Arg
Val


Val 
Gly


Asn 
545 


Val 


Thr 


Phe 


Ala 


Pro 
550 


Gly


Glu 


Thr 


He 


Gin 
555 


Thr 


Leu 


Leu 


Leu 


Glu 
560 


Val 


Leu 


Ala 


Asp 


Asp 
565 


Val 


Pro 


Glu 


He 


Glu 
570 


Glu 


Val 


He 


Gin 


Val 
575 


Gin 


Leu 


Thr 


Asp 


Ala 
580 


Ser 


Gly 


Gly 


Gly 


Thr 
585 


Ile


Gly


Leu 


Asp 


Ara 
590 


He 


Ala 


Asn 


He 


He 
595 


He 


Pro 


Ala 


Asn 


Asp 
600 


Asp 


Pro 


Tyr


Gly 


Thr 
605 


Val 


Ala 


Phe 


Ala 


Gin 
610 


Met 


Val 


Tyr 


Arg 


Val 
615 


Gin 


Glu 


Pro 


Leu 


Glu 
620 


Arg 


Ser 


Ser 


Cys 


Ala 
625 


Asn 


He 


Thr 


Val 


Arg 
630 


Arg 


Ser 


Gly 


Gly 


His 
635 


Phe 


Gly 


Arg 


Leu 


Leu 
640 


Leu 


Phe 


Tyr 


Ser 


Thr 
645 


Ser 


Asp 


He 


Asp 


Val 
650 


Val 


Ala 


Leu 


Ala 


Met 
655 


Glu 


Glu 


Gly 


Gin 


Asp 
660 
Leu


Leu 


Ser Tyr 


Tyr


Glu 


Ser 


Pro Ile


Gln


Gly Val Pro Asp 


Pro 








665 








670 






675 


Leu 


Trp


Arg Thr 


Trp 
680 


Met 


Asn 


Val Ser 


Ala 
685 


Val Gly Glu 


Pro 


Leu 
690 


Tyr


Thr 


Cys Ala 


Thr 
695 


Leu 


Cys 


Leu Lys 


Glu 
700 


Gln Ala Cys


Ser 


Ala 
705 


Phe 


Ser 


Phe Phe 


Ser 
710 


Ala 


Ser 


Glu Gly 


Pro 
715 


Gln Cys Phe


Trp 


Met 
720 


Thr 


Ser 


Trp Ile


Ser 
725 


Pro 


Ala 


Val Asn 


Asn 
730 


Ser Asp Phe 


Trp 


Thr 
735 


Tyr 


Arg 


Lys Asn 


Met 
740 


Thr 


Arg 


Val Ala 


Ser 
745 


Leu Leu Val 


Val 


Arg 
750 


Leu 


Trp 


Leu Gly 


Val 
755 


Thr 


Met 


Ser Leu 
Met


Pro 


His 


Arg 


Lys 


Glu 


Arg 


Pro 


Ser 


Gly 


Ser Ser Leu His 


Thr 


1 








5 










10 




15 


His


Gly 


Ser 


Thr 


Gly 


Thr 


Ala 


Glu Gly 


Gly 


Asn Met Ser Arg 


Leu 










20 










25 




30


Ser 


Leu 


Thr 


Arg 


Ser 


Pro 


Val 


Ser 


Pro 


Leu 


A-Lcl Ala bin biy 


■Lie 










35 










40 




45 


Pro 


Leu 


Pro 


Ala 


Gin 


Leu 


Thr 


Lys 


Ser 


Asn 


Ala Pro Val His 


He 










50 










55 




60 


Asp 


Val 


Gly 


Gly 


His 


Met 


Tyr 


Thr 


Ser 


Ser 


Leu Ala Thr Leu 


Thr 










65 










70 




75 


Lys 


Tyr 


Pro 


Asp 


Ser 


Arg 


He 


Ser Arg 


Leu 


Phe Asn Gly Thr 


Glu 










80 










85 




90 


Pro 


He 


Val 


Leu 


Asp 


Ser 


Leu 


Lys 


Gin 


His 


Tyr Phe Ile Asp


Arg 










95 










100 




105 


Asp 


Gly 


Glu 


Ile


Phe 


Arg 


Tyr 


Val 


Leu 


Ser 


Phe Leu Arg Thr 


Ser 










110 










115 




120 


Lys 


Leu 


Leu 


Leu 


Pro 


Asp 


Asp 


Phe 


Lys 


Asp 


Phe Ser Leu Leu 


Tyr 










125 










130 




135 


Glu 


Glu 


Ala 


Arg 


Tyr 


Tyr 


Gin 


Leu 


Gin 


Pro 


Met Val Arg Glu 


Leu 










140 










145 




150 


Glu 


Arg 


Trp 


Gin 


Gin 


Glu 


Gin 


Glu 


Gin 


Arg 


Arg Arg Ser Arg 


Ala 










155 










160 




165 


Cys 


Asp 


Cys 


Leu 


Val 


Val 


Arg 


Val 


Thr 


Pro 


Asp Leu Gly Glu 


Arg 










170 










175 




180 


He 


Ala 


Leu 


Ser 


Gly 


Glu 


Lys 


Ala 


Leu 


He 


Glu Glu Val Phe 


Pro 










185 










190 




195 


Glu 


Thr 


Gly 


Asp 


Val 


Met 


Cys 


Asn 


Ser 


Val 


Asn Ala Gly Trp 


Asn 










200 










205 




210 


Gin 


Asp 


Pro 


Thr 


His 


Val 


He 


Arg 


Phe 


Pro 


Leu Asn Gly Tyr 


Cys 










215 










220 




225 


Arg 


Leu 


Asn 


Ser 


Val 


Gin 


Val 


Leu 


Glu 


Arg 


Leu Phe Gln Arg


Gly 










230 










235 




240 


Phe 


Ser 


Val 


Ala 


Ala 


Ser 


Cys 


Gly Gly 


Gly 


Val Asp Ser Ser 


Gin 










245 










250 




255 


Phe 


Ser 


Glu 


Tyr 


Val 


Leu 


Cys 


Arg Glu 


Glu 


Arg Arg Pro Gln


Pro 










260 










265 




270 


Thr 


Pro 


Thr 


Ala 


Val 


Arg 


He 


Lys 


Gin 


Glu 


Pro Leu Asp 





275 280 
Met


Phe 


Arg 


Arg 


Ser 


Leu 


Asn 


Arg Phe Cys Ala Gly Glu Glu Lys


1 








5 






10 


15 


Arg 


Val 


Gly 


Thr 


Arg 


Thr 


Val 


Phe Val Gly Asn His Pro Val Ser 










20 






25 


30 


Glu 


Thr 


Glu 


Ala 


Tyr 


He 


Ala 


Gln Arg Phe Cys Asp Asn Arg Ile










35 






40 


45 


Val 


Ser 


Ser 


Lys 


Tyr 


Thr


Leu 


Trp Asn Phe Leu Pro 


Lys Asn Leu 










50 






55 


60 


Phe 


Glu 


Gin 


Phe 


Arg 


Arg 


He 


Ala Asn Phe Tyr Phe 


Leu Ile Ile










65 






70 


75 


Phe 


Leu 


Val 


Gin 


Val 


Thr 


Val 


Asp Thr Pro Thr Ser 


Pro Val Thr 










80 






85 


90 


Ser 


Gly 


Leu 


Pro 


Leu 


Phe 


Phe 


Val Ile Thr Val Thr


Ala Ile Lys










95 






100 


105 


Gin 


Gly 


Tyr 


Glu 


Asp 


Cys 


Leu 


Arg His Arg Ala Asp 


Asn Glu Val 










110 






115 


120 


Asn 


Lys 


Ser 


Thr 


Val 


Tyr 


He 


Ile Glu Asn Ala Lys


Arg Val Arg










125 






130 


135 


Lys


Glu 


Ser 


Glu 


Lys 


Ile


Lys 


Val Gly Asp Val Val 


Glu Val Gln










140 






145 


150 


Ala 


Asp 


Glu 


Thr 


Phe 


Pro 


Cys 


Asp Leu Ile Leu Leu


Ser Ser Cys 










155 






160 


165


Thr 


Thr 


Asp 


Gly 


Thr 


Cys 


Tyr 


Val Thr Thr Ala Ser 


Leu Asp Gly 










170 






175 


180 


Glu 


Ser 


Asn 


Cys 


Lys 


Thr 


His 


Tyr Ala Val Arg Asp 


Thr Ile Ala










185 






190 


195 


Leu 


Cys 


Thr 


Ala 


Glu 


Ser 


He 


Asp Thr Leu Arg Ala 


Ala Ile Glu










200 






205 


210 


Cys 


Glu 


Gin 


Pro 


Gin 


Pro 


Asp 


Leu Tyr Lys Phe Val 


Gly Arg Ile










215 






220 


225 


Asn 


Ile


Tyr 


Ser 


Asn 


Ser 


Leu 


Glu Ala Val Ala Arg 


Ser Leu Gly 










230 






235 


240 


Pro 


Glu 


Asn 


Leu 


Leu 


Leu 


Lys 


Gly Ala Thr Leu Lys 


Asn Thr Glu 
















250 


255 


Lys 


Ile


Tyr 


Gly 


Val 


Ala 


Val 


Tyr Thr Gly Met Glu 


Thr Lys Met 
















265 


270 


Ala 


Leu 


Asn 


Tyr 


Gin 


Gly 


Lys 


Ser Gln Lys Arg Ser


Ala Val Glu 










275 






280 


285 


Lys 


Ser 


Ile


Asn 


Ala 


Phe 


Leu 


Ile Val Tyr Leu Phe


Ile Leu Leu










290 






295 


300 


Thr 


Lys 


Ala 


Ala 


Val 


Cys 


Thr 


Thr Leu Lys Tyr Val 


Trp Gln Ser










305 






310 


315 


Thr 


Pro 


Tyr 


Asn 


Asp 


Glu 


Pro 


Trp Tyr Asn Gln Lys


Thr Gln Lys










320 






325 


330 


Glu 


Arg 


Glu 


Thr 


Leu 


Lys 


Val 


Leu Lys Met Phe Thr 


Asp Phe Leu 










335 






340 


345 


Ser 


Phe 


Met 


Val 


Leu 


Phe 


Asn 


Phe Ile Ile Pro Val


Ser Met Tyr 










350 






355 


360 


Val 


Tin- 


Val 


Glu 


Met 


Gin 


Lys 


Phe Leu Gly Ser Phe 


Phe Ile Ser










365 






370 


375 


Trp 


Asp 


Lys 


Asp 


Phe 


Tyr 


Asp 


Glu Glu Ile Asn Glu


Gly Ala Leu 










380 






385 


390 


Val 


Asn 


Thr 


Ser 


Asp 


Leu 


Asn 


Glu Glu Leu Gly Gln


Val Asp Tyr 
395










400 










405 


Val 


Phe 


Thr 


Asp 


Lys 
410 


Thr 


Gly 


Thr 


Leu 


Thr 
415 


Glu 


Asn 


Ser 


Met 


Glu 
420 


Phe 


He 


Glu 


Cys 


Cys 
425 


He 


Asp 


Gly 


His 


Lys 
430 


Tyr 


Lys 


Gly 


Val 


Thr 
435 


Gin 


Glu 


Val 


Asp 


Gly 
440 


Leu 


Ser 


Gin 


Thr 


Asp 
445 


Gly 


Thr 


Leu


Thr 


Tyr 
450 


Phe 


Asp 


Lys 


Val 


Asp 
455 


Lys 


Asn 


Arg 


Glu 


Glu 
460 


Leu 


Phe 


Leu 


Arg 


Ala 
465 


Leu 


Cys 


Leu 


Cys 


His 
470 


Thr 


Val 


Glu 


He 


Lys 
475 


Thr 


Asn 


Asp 


Ala 


Val 
480 


Asp 


Gly 


Ala 


Thr 


Glu 
485 


Ser


Ala 


Glu 


Leu 


Thr 
490 


Tyr 


He 


Ser 


Ser 


Ser 
495 


Pro 


Asp 


Glu 


He 


Ala 
500 


Leu 


Val 


Lys 


Gly 


Ala 
505 


Lys 


Arg 


Tyr 


Gly 


Phe 
510 


Thr 


Phe 


Leu 


Gly 


Asn 
515 


Arg 


Asn 


Gly 


Tyr 


Met 
520 


Arg 


Val 


Glu 


Asn 


Gin 
525 


Arg 


Lys


Glu 


He 


Glu 
530 


Glu 


Tyr


Glu 


Leu 


Leu 
535 


His 


Thr 


Leu 


Asn 


Phe 
540 


Asp 


Ala 


Val 


Arg 


Arg 
545 


Arg 


Met 


Ser 


Val 


He 
550 


Val 


Lys 


Thr 


Gin 


Glu 
555 


Gly 


Asp 


He 


Leu 


Leu 
560 


Phe 


Cys


Lys


Gly 


Ala 
565 


Asp 


Ser 


Ala 


Val 


Phe 
570 


Pro 


Ara 


Val 


Gin 


Asn 
575 


His 


Glu 


He 


Glu 


Leu 
580 


Thr 


Lys 


Val 


His 


Val 
585 


Glu 


Arg 


Asn 


Ala 


Met 
590 


Asp 


Gly 


Tyr


Arg 


Thr 
595 


Leu 


Cys 


Val 


Ala 


Phe 
600 


Lys


Glu 


He 


Ala 


Pro 
605 


Asp 


Asp


Tyr


Glu 


Arg 
610 


He 


Asn 


Arg 


Gin 


Leu 
615 


He 


Glu 


Ala 


Lys 


Met 
620 


Ala 


Leu 


Gin 


Asp 


Arg 
625 


Glu 


Glu 


Lys 


Met 


Glu 
630 


Lys


Val 


Phe 


Asp 


Asp 
635 


He 


Glu 


Thr 


Asn 


Met 
640 


Asn 


Leu 


He 


Gly


Ala 
645 


Thr 


Ala 


Val 


Glu 


Asp 
650 


Lys 


Leu 


Gin 


Asp 


Gin 
655 


Ala 


Ala 


Glu 


Thr 


He 
660 


Glu 


Ala 


Leu 


His 


Ala 
665 


Ala 


Gly


Leu 


Lys


Val 
670 


Trp 


Val 


Leu 


Thr 


Gly 
675 


Asp 


Lys


Met 


Glu 


Thr 
680 


Ala 


Lys


Ser 


Thr 


Cys 
685 


Tyr


Ala 


Cys 


Arg 


Leu 
690 


Phe 


Gin 


Thr 


Asn 


Thr 
695 


Glu 


Leu 


Leu 


Glu 


Leu 
700 


Thr 


Thr 


Lys 


Thr 


He 
705 


Glu 


Glu 


Ser


Glu 


Arg
710 


Lys 


Glu 


Asp 


Arg


Leu 
715 


His 


Glu 


Leu 


Leu 


He 
720 


Glu 


Tyr 


Arg 


Lys 


Lys 
725 


Leu 


Leu 


His 


Glu 


Phe 
730 


Pro 


Lys 


Ser 


Thr 


Arg 
735 


Ser 


Phe 


Lys 


Lys 


Ala 
740 


Trp 


Thr 


Glu 


His 


Gin 
745 


Glu 


Tyr 


Gly 


Leu 


He 
750 


He 


Asp 


Gly 


Ser 


Thr 
755 


Leu 


Ser 


Leu 


He 


Leu 
760 


Asn 


Ser 


Ser 


Gin 


Asp 
765 


Ser 


Ser 


Ser 


Asn 


Asn 
770 


Tyr 


Lys 


Ser 


He 


Phe 
775 


Leu 


Gin 


He 


Cys 


Met 
780 


Lys 


Cys 


Thr 


Ala 


Val 
785 


Leu 


Cys 


Cys 


Arg 


Met 
790 


Ala 


Pro 


Leu 


Gin 


Lys 
795 


Ala 


Gin 


He 


Val 


Arg 
800 


Met 


Val 


Lys 


Asn 


Leu 
805 


Lys 


Gly 


Ser 


Pro 


He 
810 


Thr 


Leu 


Ser 


He 


Gly 
815 


Asp 


Gly 


Ala 


Asn 


Asp 
820 


Val 


Ser 


Met 


He 


Leu 
825 


Glu 


Ser 


His 


Val 


Gly 
830 


He 


Gly 


He 


Lys 


Gly 
835 


Lys 


Glu 


Gly 


Arg 


Gin 
840 


Ala 


Ala 


Arg 


Asn 


Ser 
845 


Asp 


Tyr 


Ser 


Val 


Pro 
850 


Lys 


Phe 


Lys 


His 


Leu 
855 


Lys 


Lys 


Leu 


Leu 


Leu 
860 


Ala 


His 


Gly 


His 


Leu 
865 


Tyr 


Tyr 


Val 


Arg 


He 
870 
Ala


His 


Leu 


Val Gln


Tyr 


Phe 


Phe 


Tyr Lys 


Asn Leu Cys 


Phe Ile








875 








880 




885 


Leu 


Pro 


Gin 


Phe Leu 


Tyr 


Gin 


Phe 


Phe Cys 


Gly Phe Ser 


Gln Gln








890 








895 




900 


Pro 


Leu 


Tyr 


Asp Ala 


Ala 


Tyr 


Leu 


Thr Met 


Tyr Asn Ile


Cys Phe 






905 








910 




915 


Thr 


Ser 


Leu 


Pro Ile


Leu 


Ala 


Tyr 


Ser Leu 


Leu Glu Gln


His Ile








920 








925 




930 


Asn 


Ile


Asp 


Thr Leu 


Thr 


Ser 


Asp 


Pro Arg 


Leu Tyr Met 


Lys Ile






935 








940 




945 


Ser 


Gly 


Asn 


Ala Met 


Leu 


Gin 


Leu 


Gly Pro 


Phe Leu Tyr 


Trp Thr 






950 








955 




960 


Phe 


Leu 


Ala 


Ala Phe 


Glu 


Gly 


Thr 


Val Phe 


Phe Phe Gly 


Thr Tyr 








965 








970 




975 


Phe 


Leu 


Phe 


Gln Thr


Ala 


Ser 


Leu 


Glu Glu 


Asn Gly Lys 


Val Tyr 








980 








985 




990 


Gly 


Asn 


Trp 


Thr Phe 


Gly 


Thr 


He 


Val Phe 


Thr Val Leu 


Val Phe 




995 








1000 




1005 


Thr 


Val 


Thr 


Leu Lys 


Leu 


Ala 


Leu 


Asp Thr 


Arg Phe Trp 


Thr Trp 








1010 








1015 




1020 


He 


Asn 


His 


Phe Val 


He 


Trp 


Gly 


Ser Leu 


Ala Phe Tyr 


Val Phe 








1025 








1030 




1035 


Phe 


Ser 


Phe 


Phe Trp 


Gly 


Gly 


He 


Ile Trp


Pro Phe Leu 


Lys Gln








1040 








1045 




1050 


Gin 


Arg 


Met 


Tyr Phe 


Val 


Phe 


Ala 


Gln Met


Leu Ser Ser 


Val Ser 






1055 








1060 




1065 


Thr 


Trp 


Leu 


Ala Ile


He 


Leu 


Leu 


Ile Phe


Ile Ser Leu


Phe Pro 






1070 








1075 




1080 


Glu 


He 


Leu 


Leu Ile


Val 


Leu 


Lys 


Asn Val 


Arg Arg Arg 


Ser Ala 








1085 








1090 




1095 


Arg 


Arg 


Asn 


Leu Ser 


Cys 


Arg 


Arg 


Ala Ser 


Asp Ser Leu 


Ser Ala 








1100 








1105 




1110 


Arg 


Pro 


Ser 


Val Arg 


Pro 


Leu 


Leu 


Leu Arg 


Thr Phe Ser 


Asp Glu 






1115 








1120 




1125 


Ser 


Asn 


Val 


Leu 
Met


Ser 


Arg 


Lys 


Ala 


Ser 


Glu 


Asn 


Val 


Glu 


Tyr Thr Leu 


Arg 


Ser 


1 








5 










10 






15 


Leu 


Ser 


Ser 


Leu 


Met 


Gly 


Glu 


Arg 


Arg 


Arg 


Lys Gln Pro


Glu 


Pro 










20 










25 






30 


Asp 


Ala 


Ala 


Ser 


Ala 


Ala 


Gly 


Glu 


Cys 


Ser 


Leu Leu Ala 


Ala 


Ala 








35 










40 






45 


Glu 


Ser 


Ser 


Thr 


Ser 


Leu 


Gin 


Ser 


Ala 


Gly 


Ala Gly Gly 


Gly 


Gly 










50 










55 






60 


Val 


Gly 


Asp 


Leu 


Glu 


Arg 


Ala 


Ala 


Arg 


Arg 


Gln Phe Gln


Gin 


Asp 










65 










70 






75 


Glu 


Thr 


Pro 


Ala 


Phe 


Val 


Tyr 


Val 


Val 


Ala 


Val Phe Ser 


Ala 


Leu 










80 










85 






90 


Gly 


Gly 


Phe 


Leu 


Phe 


Gly 


Tyr 


Asp 


Thr 


Gly 


Val Val Ser 


Gly 


Ala 










95 










100 






105 


Met 


Leu 


Leu 


Leu 


Lys 


Arg 


Gin 


Leu 


Ser 


Leu 


Asp Ala Leu 


Trp 


Gin 



110 115 120 
Glu Leu Leu Val Ser Ser Thr Val Gly Ala Ala Ala Val Ser Ala

125 130 135

Leu Ala Gly Gly Ala Leu Asn Gly Val Phe Gly Arg Arg Ala Ala

140 145 150

Ile Leu Leu Ala Ser Ala Leu Phe Thr Ala Gly Ser Ala Val Leu

155 160 165

Ala Ala Ala Asn Asn Lys Glu Thr Leu Leu Ala Gly Arg Leu Val

170 175 180

Val Gly Leu Gly Ile Gly Ile Ala Ser Met Thr Val Pro Val Tyr

185 190 195

Ile Ala Glu Val Ser Pro Pro Asn Leu Arg Gly Arg Leu Val Thr

200 205 210

Ile Asn Thr Leu Phe Ile Thr Gly Gly Gln Phe Phe Ala Ser Val

215 220 225

Val Asp Gly Ala Phe Ser Tyr Leu Gln Lys Asp Gly Trp Arg Tyr

230 235 240

Met Leu Gly Leu Ala Val Val Pro Ala Val Ile Gln Phe Phe Gly

245 250 255

Phe Leu Phe Leu Pro Glu Ser Pro Arg Trp Leu Ile Gln Lys Gly

260 265 270

Gln Thr Gln Lys Ala Arg Arg Ile Leu Ser Gln Met Arg Gly Asn

275 280 285

Gln Thr Ile Asp Glu Glu Tyr Asp Ser Ile Lys Asn Asn Ile Glu

290 295 300

Glu Glu Glu Lys Glu Val Gly Ser Ala Gly Pro Val Ile Cys Arg

305 310 315

Met Leu Ser Tyr Pro Gln Thr Arg Arg Ala Leu Ile Val Gly Cys

320 325 330

Gly Leu Gln Met Phe Gln Gln Leu Ser Gly Ile Asn Thr Ile Met

335 340 345

Tyr Tyr Ser Ala Thr Ile Leu Gln Met Ser Gly Val Glu Asp Asp

350 355 360

Arg Leu Ala Ile Trp Leu Ala Ser Val Thr Ala Phe Thr Asn Phe

365 370 375

Ile Phe Thr Leu Val Gly Val Trp Leu Val Glu Lys Val Gly Arg

380 385 390

Arg Lys Leu Thr Phe Gly Ser Leu Ala Gly Thr Thr Val Ala Leu

395 400 405

Ile Ile Leu Ala Leu Gly Phe Val Leu Ser Ala Gln Val Ser Pro

410 415 420

Arg Ile Thr Phe Lys Pro Ile Ala Pro Ser Gly Gln Asn Ala Thr

425 430 435

Cys Thr Arg Tyr Ser Tyr Cys Asn Glu Cys Met Leu Asp Pro Asp

440 445 450

Cys Gly Phe Cys Tyr Lys Met Asn Lys Ser Thr Val Ile Asp Ser

455 460 465

Ser Cys Val Pro Val Asn Lys Ala Ser Thr Asn Glu Ala Ala Trp

470 475 480

Gly Arg Cys Glu Asn Glu Thr Lys Phe Lys Thr Glu Asp Ile Phe

485 490 495

Trp Ala Tyr Asn Phe Cys Pro Thr Pro Tyr Ser Trp Thr Ala Leu

500 505 510

Leu Gly Leu Ile Leu Tyr Leu Val Phe Phe Ala Pro Gly Met Gly

515 520 525

Pro Met Pro Trp Thr Val Asn Ser Glu Ile Tyr Pro Leu Trp Ala

530 535 540

Arg Ser Thr Gly Asn Ala Cys Ser Ser Gly Ile Asn Trp Ile Phe

545 550 555

Asn Val Leu Val Ser Leu Thr Phe Leu His Thr Ala Glu Tyr Leu

560 565 570

Thr Tyr Tyr Gly Ala Phe Phe Leu Tyr Ala Gly Phe Ala Ala Val

575 580 585

Gly Leu Leu Phe Ile Tyr Gly Cys Leu Pro Glu Thr Lys Gly Lys
590 595 600

Lys Leu Glu Glu Ile Glu Ser Leu Phe Asp Asn Arg Leu Cys Thr

605 610 615

Cys Gly Thr Ser Asp Ser Asp Glu Gly Arg Tyr Ile Glu Tyr Ile

620 625 630

Arg Val Lys Gly Ser Asn Tyr His Leu Ser Asp Asn Asp Ala Ser

635 640 645

Asp Val Glu
<220>

<221> misc_feature 

<223> Incyte ID No: 7482060CD1 

<400> 19 



Met 


Thr 


Phe 


Gly 


Arg 


Ser 


Gly 


Ala Ala Ser Val 


Val Leu 


Asn Val 


1 








5






10 




15 


Gly 


Gly 


Ala 


Arg 


Tyr 


Ser 


Leu 


Ser Arg Glu Leu 


Leu Lys 


Asp Phe 










20 






25 




30 


Pro 


Leu 


Arg 


Arg 


Val 


Ser 


Arg 


Leu His Gly Cys 


Arg Ser Glu Arg 










35 






40 




45 


Asp 


Val 


Leu 


Glu 


Val 


Cys 


Asp 


Asp Tyr Asp Arg 


Glu Arg Asn Glu 










50 






55 




60 


Tyr 


Phe 


Phe 


Asp 


Arg 


His 


Ser 


Glu Ala Phe Gly 


Phe Ile


Leu Leu 










65 






70 




75 


Tyr 


Val 


Arg 


Gly 


His 


Gly 


Lys 


Leu Arg Phe Ala 


Pro Arg 


Met Cys 










80 






85 




90 


Glu 


Leu 


Ser 


Phe 


Tyr 


Asn 


Glu 


Met Ile Tyr Trp


Gly Leu 


Glu Gly 










95






100 




105 


Ala


His


Leu 


Glu 


Tyr 


Cys 


Cys 


Gln Arg Arg Leu


Asp Asp 


Arg Met 










110 






115




120 


Ser 


Asp 


Thr 


Tyr 


Thr 


Phe 


Tyr 


Ser Ala Asp Glu 


Pro Gly Val Leu 










125 






130 




135 


Gly 


Arg 


Asp 


Glu 


Ala 


Arg 


Pro 


Gly Ala Arg Gly 


Gly Ser Leu Gln










140 






145 




150 


Ala 


Leu 


Ala 


Gly 


Ala 


His 


Ala 


Ala Asp Leu Arg 


Gly Ala His Ile










155 






160 




165 


Leu 


Ala 


Ser 


Val 


Ser 


Val 


Val 


Phe Val Ile Val


Ser Met 


Val Val 










170 






175 




180 


Leu 


Cys 


Ala 


Ser 


Thr 


Leu 


Pro 


Asp Trp Arg Asn 


Ala Ala 


Ala Asp 










185 






190 




195 


Asn 


Arg 


Ser 


Leu 


Asp 


Asp 


Arg 


Ser Arg Ile Ile


Glu Ala 


Ile Cys










200 






205 




210 


Ile


Gly 


Trp 


Phe 


Thr 


Ala 


Glu 


Cys Ile Val Arg


Phe Ile


Val Ser 










215 






220 




225 


Lys 


Asn 


Lys 


Cys 


Glu 


Phe 


Val 


Lys Arg Pro Leu 


Asn Ile


Ile Asp










230 






235 




240 


Leu 


Leu 


Ala 


Ile


Thr 


Pro 


Tyr 


Tyr Ile Ser Val


Leu Met 


Thr Val 










245 






250 




255 


Phe 


Thr 


Gly 


Glu 


Asn 


Ser 


Gin 


Leu Gln Arg Ala


Gly Val 


Thr Leu 










260 






265 




270 


Arg 


Val 


Leu 


Arg 


Met 


Met 


Arg 


Ile Phe Trp Val


Ile Lys


Leu Ala 










275 






280 




285 


Arg 


His 


Phe 


Ile


Gly 


Leu 


Gln


Thr Leu Gly Leu 


Thr Leu Lys Arg 










290 






295 




300 


Cys 


Tyr 


Arg 


Glu 


Met 


Val 


Met 


Leu Leu Val Phe 


Ile Cys Val Ala










305 






310 




315 


Met 


Ala 


Ile


Phe 


Ser 


Ala 


Leu 


Ser Gln Leu Leu


Glu His 


Gly Leu 
320






Asp 


Leu 


Glu Thr 


Ser 
335 


Asn 


Lys


Cys 


Trp 


Trp Val 


Ile
350 


He 


Ser 


Met 


Tyr 


Pro Ile


Thr 
365 


Val 


Pro 


Val 


Val 


Ser Gly 


Ile
380 


Val 


Leu 


Tyr 


His 


Ser Phe 


Val 
395 


Gln


Cys 


Ala 


Arg 


Ser Ile


Cys 
410 


Leu 


Thr 


Val 


Gly 


Tyr Thr 


Glu 
425 


Met 


Thr 


Leu 


Arg 


Asp Pro


jr a 

440 


Thr 




Gly


Val 


Leu Tyr 


455 


Ala 


lit: L. 


Gly


Gly


Pro Pro 


Val 
470 


Glu 




TlD 


Cys 


Phe His 


Pro 
485 


Ala 




Ser 


Met 


Ala Val 


Ala
500 


O CI 


XT J. U 


Gly 


Gly 


Phe Leu 


Arg 
515 


Thr 


Glu 


Gly 


Pro 


Val Asp 


Gly 
530 


Leu 


Asn 


Gly 


Cys 


Lys Asp 


Phe 
545 







Asp 
Met 
Gly 
Leu 
Tyr 
Ser 
Ile
Lys 
Ala 
Leu 
Ser 
Gly 
Ala 
Cys 



325 
Phe Thr 

340 
Thr Thr 

355 
Arg Ile

370 
Ala Leu 

385 
His Glu 

400 
Val Thr 

415 
Asn Gly 

430 
Lys Pro 

445 
Asp Leu 

460 
Pro Pro 

475 
Thr Leu 

490 
Ser Arg 

505 
Leu Val 

520 
Glu Asn 

535 



Ser Ile Pro
Val Gly Tyr
Leu Gly Gly
Pro Ile Thr
Leu Lys Phe
Ser Val Leu
Pro Cys Pro
Leu Lys Thr
Trp Gln Ser
Asp Pro Leu
Cys Gly Pro
Pro Ala Ala
Leu Ile Val
His Pro Phe 



330 
Ala Ala 

345 
Gly Asp 

360 
Val Cys 

375 
Phe Ile

390 
Arg Ser 

405 
Gly Thr 

420 
Asp Ala 

435 
His Ser 

450 
Leu Glu 

465 
Thr Arg 

480 
Ala Asn 

495 
Pro Gly 

510 
Ala Ala 

525 
Arg Gly 

540 



<210> 20 
<211> 262 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 1578772CD1 

<400> 20 



Met Trp Gly Trp Glu Ala Leu Phe Leu Phe Cys Ser Cys Ser Ser
1               5                   10                  15
Phe Ser Leu Ala Gly Arg Pro Leu Leu Leu His Ser Gly Pro Val
                20                  25                  30
Gly Ala Ala Val Ala Gly Ala Leu Leu Leu Leu Ser Ala Gln Gly
                35                  40                  45
Cys Pro Gly Leu His Gln His Leu Gln His Ala Pro Gly Val Leu
                50                  55                  60
Pro Asp Ala Gly Thr Ser Thr Thr Met Ala His Gln Pro Ser Gly
                65                  70                  75
Leu Cys Cys Val Asp Gly His Leu Gly Gly Ser Ser Asp Pro Glu
                80                  85                  90
Cys Gly Phe Gly Pro Gly Cys Gly Cys Gly Leu Leu His Asp Asp
                95                  100                 105
Cys Gly Leu Pro His Pro Glu Leu Leu Gln Val Pro Gly Leu Cys
                110                 115                 120
Ile Leu Ser Tyr Pro Thr Pro Leu Tyr Phe Gly Thr Arg Gly Gln
                125                 130                 135
Phe Arg Cys Asn Leu Glu Trp His Leu Gly Leu Gly Glu Gly Glu
                140                 145                 150
Lys Glu Thr Ser Lys Pro Asp Gly Pro Met Val Ala Val Ala Glu
                155                 160                 165
Pro Val Arg Val Val Val Leu Asp Phe Ser Gly Val Thr Phe Ala
                170                 175                 180
Asp Ala Ala Gly Ala Arg Glu Val Val Gln Leu Ala Ser Arg Cys
                185                 190                 195
Arg Asp Ala Arg Ile Arg Leu Leu Leu Ala Gln Cys Asn Ala Leu
                200                 205                 210
Val Gln Gly Thr Leu Thr Arg Val Gly Leu Leu Asp Arg Val Thr
                215                 220                 225
Pro Asp Gln Leu Phe Val Ser Val Gln Asp Ala Ala Ala Tyr Ala
                230                 235                 240
Leu Gly Ser Leu Val Arg Gly Ser Ser Thr Arg Ser Gly Ser Gln
                245                 250                 255
Glu Ala Leu Gly Cys Gly Lys
                260

<210> 21 
<211> 1373 
<212> DNA 
<213> Homo sapiens 

<220> 

<221> misc_feature
<223> Incyte ID No: 1626101CB1 

<400> 21 

cacgcgcggc ctggcggcgg cggccactct aaccagcgca aaatgtccct ggaacaggag 60
gaggaaacgc aacctgggcg gctcctagga cgcagagacg ccgtccccgc cttcattgag 120
cccaacgtgc gcttctggat caccgagcgc caatccttta ttcgacgatt tcttcaatgg 180
acagaattat tagatcctac aaatgtgttc atttcagttg aaagtataga aaactcgagg 240
caactattgt gcacaaatga agatgtttcc agccctgcct cggcggacca aaggatacag 300
gaagcttgga agcggagtct tgcaacagtg catcccgaca gcagcaacct gatccccaag 360
ctttttcgac ctgcagcgtt cctgcctttc atggcgccca cggtattttt gtcaatgacg 420
ccactgaaag ggatcaagtc cgtgatttta cctcaggttt tcctctgtgc ctacatggca 480
gcgttcaaca gcatcaatgg aaacagaagt tacacttgta agccactaga aagatcatta 540
ctaatggcgg gagccgttgc ttcttcaact ttcttaggag taatccctca gtttgtccag 600
atgaagtatg gcctgactgg cccttggatt aaaagactct tacctgtgat cttcctcgtg 660
caagccagtg gaatgaatgt ctacatgtcc cgaagtcttg aatccattaa ggggattgcg 720
gtcatggaca aggaaggcaa tgtcctgggt cattccagaa ttgctgggac aaaggctgtt 780
agagaaacgc tagcatccag aatagtgctg tttgggacct cagctctgat tcctgaagtc 840
ttcacctact tttttaaaag gacccagtat ttcaggaaaa acccagggtc attgtggatt 900
ttgaaactgt cttgtactgt cctggcaatg ggactgatgg tgccattttc ttttagtata 960
tttccacaga ttggacagat acagtactgt agtcttgaag agaaaattca gtctccaaca 1020
gaagaaacag aaatctttta tcacagaggg gtgtaggcgt gagttttagg tgaatttatg 1080
tggttcctgc ttgaaaacct tcccctctcc aggttcggtt tagagaactt tgccacaggt 1140
cttctgggga ccccagaggt gtctgtgctg acaaggcgac ttcagattcc atactgagat 1200
cgttcccagg ctggcgtctc tggggttttt aaggctggct ggagaagaca gtgggagggt 1260
gccccgtctg acacccctgg ggttgctgag ggaacggttg gagtggggat cggcctgcga 1320
aaggatactg tgaaatcact aattaactaa taaacctgtc tcaagttgag gaa 1373

<210> 22
<211> 3231
<212> DNA
<213> Homo sapiens

<220>

<221> misc_feature
<223> Incyte ID No: 2907828CB1

<400> 22

ttcggcggct gcggcggctg caacagcttc gggctcgggg ttttggcggc ggcgccggcg 60
ggctaggctg cgcggtgcgg accccggcgc gcggtccggg ttgctggggc ggcgcgtaag 120
atgcctctaa tggaggagtt tctgagcagc acccctggcc cagtggcttt gaaagggagc 180
tcaaaccaga gactatttca agccctggat atcatatcct gagggccaca ggagaagaga 240
acatggctgt gagtttggat gacgacgtgc cgctcatcct gaccttggat gagggtggca 300 
gtgccccact ggctccctcc aacggcctgg gccaagaaga gctacctagc aaaaatggcg 360 
gcagctatgc catccacgac tcccaggccc ccagtctcag ctctgggggt gagagttccc 420 
cctccagccc cgcacacaac tgggagatga attaccaaga ggcagcaatc tacctccagg 480 
aaggcgagaa caacgacaag ttcttcaccc accccaagga tgccaaggcg ctggcggcct 540 
acctctttgc acacaatcac ctcttctacc tgatggagct ggccacggcc ctgctgctgc 600 
tgctgctctc cctgtgcgag gcccccgccg tccccgcact ccggcttggc atctatgtcc 660 
acgccaccct ggagctgttt gccctgatgg tggtagtgtt tgaactctgc atgaagttac 720 
gctggctggg cctccacacc ttcatccggc acaagcggac catggtcaag acctcggtgc 780 
tggtggtgca gtttgtcgag gccatcgtgg tgttggtacg gcagatgtcc catgtgcggg 840 
tgacccgagc actgcgctgc attttcctgg tggactgtcg gtattgcggt ggcgtccggc 900 
gcaacctgcg gcagatcttc cagtccctgc cgcccttcat ggacatcctc ctgctgctgc 960 
tgttcttcat gatcatcttt gccatcctcg gtttctactt gttctcccct aacccttcag 1020 
acccctactt cagcaccctg gagaacagca tcgtcagtct gtttgtcctt ctgaccacag 1080 
ccaatttccc agatgtgatg atgccctcct actcccggaa cccctggtcc tgcgtcttct 1140 
tcatcgtgta cctctccatc gagctgtatt tcatcatgaa cctgcttctg gctgtggtgt 1200 
tcgacacctt caatgacatt gagaaacgca agttcaagtc tttgctactg cacaagcgaa 1260 
ccgctatcca gcatgcctac cgcctgctca tcagccagag gaggcctgcc ggcatctcct 1320 
acaggcagtt tgaaggcctc atgcgcttct acaagccccg gatgagtgcc agggagcgct 1380 
atcttacctt caaggccctg aatcagaaca acacacccct gctcagccta aaggactttt 1440 
acgatatcta cgaagttgct gctttgaagt ggaaggccaa gaaaaacaga gagcactggt 1500 
ttgatgagct tcccaggacg gcgctcctca tcttcaaagg tattaatatc cttgtgaagt 1560 
ccaaggcctt ccagtatttc atgtacttgg tggtggcagt caacggggtc tggatcctcg 1620 
tggagacatt tatgctgaaa ggtgggaact tcttctccaa gcacgtgccc tggagttacc 1680 
tcgtctttct aactatctat ggggtggagc tgttcctgaa ggttgccggc ctgggccctg 1740 
tggagtactt gtcttccgga tggaacttgt ttgacttctc cgtgacagtg ttcgccttcc 1800 
tgggactgct ggcgctggcc ctcaacatgg agcccttcta tttcatcgtg gtcctgcgcc 1860 
ccctccagct gctgaggttg tttaagttga aggagcgcta ccgcaacgtg ctggacacca 1920 
tgttcgagct gctgccccgg atggccagcc tgggcctcac cctgctcatc ttttactact 1980 
ccttcgccat cgtgggcatg gagttcttct gcgggatcgt cttccccaac tgctgcaaca 2040 
cgagtacagt ggcagatgcc taccgctggc gcaaccacac cgtgggcaac aggaccgtgg 2100 
tggaggaagg ctactattat ctcaataatt ttgacaacat cctcaacagc tttgtgaccc 2160 
tgtttgagct cacagttgtc aacaactggt acatcatcat ggaaggcgtc acctctcaga 2220 
cctcccactg gagccgcctc tacttcatga ccttttacat tgtgaccatg gtggtgatga 2280 
cgatcattgt cgcctttatc ctcgaggcct tcgtcttccg aatgaactac agccgcaaga 2340 
accaggactc ggaagttgat ggtggcatca cccttgagaa ggaaatctcc aaagaagagc 2400 
tggttgccgt cctggagctc taccgggagg cacggggggc ctcctcggat gtcaccaggc 2460 
tgctggagac cctctcccag atggagagat accagcaaca ttccatggtg tttctgggac 2520 
ggcgatcaag gaccaagagc gacctgagcc tgaagatgta ccaggaggag atccaggagt 2580 
ggtatgagga gcatgccagg gagcaagagc agcagcgaca actcagcagc agtgcagccc 2640 
ccgccgccca gcagccccca ggcagccgcc agcgctccca gaccgttacc tagcccagcg 2700 
cccgaaagcc gtctcttcta tgcaataaca caatagtatt actctactgc gatgtacgga 2760 
actgcggtgt gtgtacacat actcacgtat atgcacatat ttatatacag gaagaaaaaa 2820 
gacagacaag atggggcttg gtttataacc accttgccct gtcttcctta actccagaag 2880 
ccagtttggt gaggggtggg ggtgcggcca ccaggtctga gctcttccta ctgtggaagg 2940 
ctccagaagg cccttcacaa ggagacccct cacctggatc cagtcgactg cggggcttgc 3000 
ccctcatgtg ggctggcctc catcggccac gtccaaagct gtcactgcta ctgcttcagg 3060
ctcacatccc cccgacctga tggcgtgccc gccccctctc cctgcggccc atgccacagg 3120 
tttctgtgtt ttgctttagg gacagaacca cttaggaagg aaagaactcc cggtctccag 3180 
ggtggtattt cagtgtctgt gataatgtca cgcaacacct cttcggggac c 3231 

<210> 23 

<211> 3160 

<212> DNA

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 3968527CB1 

<400> 23 

atgacggaca acatcccgct gcagccggtg cgccagaaga agcggatgga cagcaggccc 60
cgcgccgggt gctgcgagtg gctgagatgc tgcggtggag gggaggccag gccccgcact 120
gtctggctgg ggcaccccga gaagagagac cagaggtatc ctcggaatgt catcaacaat 180
cagaagtaca atttcttcac ctttcttcct ggggtgctgt tcaaccagtt caaatacttt 240
ttcaacctct atttcttact tcttgcctgc tctcagtttg ttcccgaaat gagacttggt 300
gcactctata cctactgggt tcccctgggc ttcgtgctgg ccgtcactgt catccgtgag 360
gcggtggagg agatccgatg ctacgtgcgg gacaaggaag tcaactccca ggtctacagc 420
cggctcacag cacgaggcac agtgaaggtg aagagttcta acatccaagt tggagacctt 480
atcatcgttg aaaagaacca gcgggtccct gccgacatga tcttcctgag gacatcagaa 540
aaaaacgggt catgcttctt gcggacggat cagctggatg gggagacgga ctggaagctg 600
cggcttcccg tggcctgcac gcagaggctc cccacggccg ccgaccttct tcagattcga 660
tcgtatgtgt acgcagaaga gccaaatatt gacattcaca acttcgtggg aacttttacc 720
cgagaagaca gcgacccccc gatcagcgag agcctgagca tagagaacac gctgtgggct 780
ggcactgtgg tcgcatcagg tactgttgtg ggtgttgttc tttacactgg cagagaactc 840
cggagtgtca tgaatacctc aaatccccga agtaagatcg gcctgttcga cttggaagtg 900
aactgcctca ccaagatcct ctttggtgcc ctggtggtgg tctcgctggt catggttgcc 960
cttcagcact ttgcaggccg ttggtacctg cagatcatcc gcttcctcct cttgttttcc 1020
aacatcatcc ccattagttt gcgcgtgaac ctggacatgg gcaagatcgt gtacagctgg 1080
gtgattcgaa gggactcaaa aatccccggg accgtggttc gctccagcac gattcctgag 1140
cagctgggca ggatttcgta cttactcaca gacaagacag gcactcttac ccagaacgag 1200
atgattttca aacggctcca tctcggaaca gtagcctacg gcctcgactc aatggacgaa 1260
gtacaaagcc acattttcag catttacacc cagcaatccc aggacccacc ggctcagaag 1320
ggcccaacgc tcaccactaa ggtccggcgg accatgagca gccgcgtgca cgaagccgtg 1380
aaggccatcg cgctctgcca caacgtgact cccgtgtatg agtccaacgg tgtgactgat 1440
caggctgagg ccgagaagca gtacgaagac tcctgccgcg tataccaggc atccagcccc 1500
gatgaggtgg ccctggtaca gtggacggaa agtgtgggct taaccctggt gggccgagac 1560
cagtcttcca tgcagctgag gacccctggc gaccagatcc tgaacttcac catcctacag 1620
atcttccctt tcacctatga aagcaaacgt atgggcatca tcgtgcggga tgaatcaact 1680
ggagaaatta cgttttacat gaagggagca gatgtggtca tggctggcat tgtgcagtac 1740
aatgactggt tggaggaaga gtgtggcaac atggcccgag aagggctgcg ggtgctcgtg 1800
gtggcaaaga agtctcttgc agaggagcag tatcaggact ttgaagcccg ctacgtccag 1860
gccaagctga gtgtgcacga ccgctccctc aaagtggcca cggtgatcga gagcctggag 1920
atggagatgg aactgctgtg cctgacgggc gtggaggacc agctgcaggc agatgtgcgg 1980
cccacgctgg agaccctgag gaatgctggc atcaaggttt ggatgctgac aggggacaag 2040
ctggagacag ctacgtgcac agcgaagaat gcacatctgg tgaccagaaa ccaagacatc 2100
cacgtttttc ggctggtgac caaccgcggg gaggctcacc tcgagctgaa cgccttccgc 2160
aggaagcatg attgtgccct ggtcatctcg ggagactccc tggaggtttg cctcaagtac 2220
tatgagtacg agttcatgga gctggcctgc cagtgcccgg ccgtagtctg ctgccgatgt 2280
gcccccaccc agaaggccca gatcgtgcgc ctgcttcagg agcgcacggg caagctcacc 2340
tgtgcagtag gggacggagg caatgacgtc agcatgattc aggaatctga ctgcggcgtg 2400
ggagtggaag gaaaggaagg aaaacaggct tcgttggctg cagacttctc catcactcaa 2460
tttaagcatc ttggccggtt gcttatggtg catggccgga acagctacaa gcggtcagcc 2520
gccctcagcc agttcgtgat tcacaggagc ctctgtatca gcaccatgca ggctgtcttt 2580
tcctccgtgt tttactttgc ctccgtccct ctctatcaag gattcctcat cattgggtac 2640
tccacaattt acaccatgtt tcctgtgttt tctctggtcc tggacaaaga tgtcaaatcg 2700
gaagttgcca tgctgtatcc tgagctctac aaggatcttc tcaagggacg gccgttgtcc 2760
tacaagacat tcttaatatg ggttttgatt agcatctatc aagggagcac catcatgtac 2820
ggggcgctgc tgctgtttga gtcggagttc gtgcacatcg tggccatctc cttcacctcg 2880
ctgatcctca ccgagctgct catggtggcg ctgaccatcc agacctggca ctggctcatg 2940
acagtggcgg agctgctcag cctggcctgc tacatcgcct ccctggtgtt cttacacgag 3000
ttcatcgatg tgtacttcat cgccaccttg tcattcttgt ggaaagtctc cgtcatcact 3060
ctggtcagct gcctccccct ctatgtcctc aagtacctgc gaagacggtt ctctcccccc 3120
agctactcaa agctcacatc ataggccgtg cgttcgctgg 3160

<210> 24

<211> 2848

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7472732CB1

<400> 24

cttaacactg aacccattac ttttccaaga ccagaaaaaa atattacatg aacaggaact 60

acttctcctt cagataagaa ttcaagcttt gacattgtaa accacagacg aattggagct 120 

cggcattgaa aggaggtgtt ctgcaatgat tttttttctt gtttagagaa gtttacttct 180 
acaagaagaa atctgaaaaa tgacaggagc aaagaggaaa aagaaaagca tgctttggag 240 
caagatgcat accccccagt gtgaagacat tatacagtgg tgtagaaggc gactgcccat 300 
tttggattgg gcaccacatt acaatctgaa agaaaacttg cttccagaca ctgtgtctgg 360 

gataatgttg gcagttcaac aggtgaccca aggattggcc tttgctgttc tctcatctgt 420 
gcacccagtg tttggtttat atgggtctct gtttcctgcc ataatttatg ccatatttgg 480 

aatgggacat catgttgcca caggcacctt tgccttgaca tccttaatat cagccaacgc 540 

cgtggaacgg attgtccctc agaacatgca gaatctcacc acacagagta acacaagcgt 600 

gctgggctta tccgactttg aaatgcaaag gatccacgtt gctgcagcag tttccttctt 660 

gggaggtgtg attcaggtgg ccatgtttgt gctgcaactg ggcagtgcca catttgtggt 720 

cacagagcct gtgatcagcg caatgacaac tggggctgcc acccatgtgg tgacttcaca 780 

agtcaaatat ctcttgggaa tgaaaatgcc atatatatcc ggaccacttg gattctttta 840 

tatttatgca tatgtttttg aaaacatcaa gtctgtgcga ctggaagcat tgcttttatc 900 

cttgctgagc attgtggtcc ttgttcttgt taaagagctg aatgaacagt ttaaaaggaa 960 

aattaaagtt gttcttcctg tagatttagt tttggctcca aacacatcgc cactccatca 1020 

ccactacgac tgtctctttg ccaactttct tgagccaccc tgggaggatg gacttccaga 1080 

aggtgccttc aaccaggcag aaggacattt gcgcaggaac ataattccct cacctagagc 1140 
tcccccgatg aacatcctct ctgcggtgat cactgaagct ttcggagtgg cacttgtagg 1200 

ctatgtggcc tcactggctc ttgctcaagg atctgccaaa aaattcaaat attcaattga 1260 

tgacaaccag gaatttttgg cccatggcct cagcaatata gtttcttcat ttttcttctg 1320 

cataccaagt gctgctgcca tgggaaggac ggctggcctg tacagcacag gagcgaagac 1380 

acaggtggct tgtctaatat cttgcatttt cgtccttata gtcatctatg caataggacc 1440 

tttgctttac tggctgccca tgtgtgtcct tgcaagcatt attgttgtgg gactgaaggg 1500 

aatgctaata cagttccgag atttaaaaaa atattggaat gtggataaaa tcgattgggg 1560 

aatatgggtc agtacatatg tatttacaat atgctttgct gccaatgtgg gactgctgtt 1620 

tggtgttgtt tgtaccatag ctatagtgat aggacgcttc ccaagagcaa tgactgtaag 1680 

tataaaaaat atgaaagaaa tggaatttaa agtgaagaca gaaatggaca gtgaaaccct 1740 

gcagcaggtg aaaattatct caataaacaa cccgcttgtt ttcctgaatg caaaaaaatt 1800 

ttatactgat ttaatgaaca tgatccaaaa ggaaaatgcc tgtaatcagc cacttgatga 1860 

tatcagcaag tgtgaacaaa acacattgct taattcccta tccaatggca actgcaatga 1920 

agaagcttca cagtcctgcc ctaatgagaa gtgttattta atcctggatt gcagtggatt 1980 

tacctttttt gactattctg gagtctccat gcttgttgag gtttacatgg actgtaaagg 2040 

caggagtgtg gatgtattgt tagcccattg tacagcttcc ttgataaaag caatgacgta 2100 

ttatggaaac ctagactcag agaaaccaat tttttttgaa tcggtatctg ctgcaataag 2160 

tcatatccat tcaaataaga atttgagcaa actcagtgac cacagtgaag tctgagaccc 2220 

ttttgtcaca gtacagctct tgtctttacc aactgcctga agaggccata tgctggcatt 2280 

ttgcacaact ttttggttgt ttagatccta cagatgacct ctgctacaat aagtacgatg 2340 

tgacttagta actgcatagc agttggaaag aactgccaac ttttttttct catttttgtt 2400 

agtaagaaga ttcgcttagt tattttatgt aaaaatcagt atgtgtttag ttttagtgta 2460 

ctgaagggta aacatggttt tattttattt taccatatta ttttgtgttg ttttatttct 2520 

attgtgctgt aagttgatgt ttaaaattga gaaatacttt tgtcataggt aatttggaac 2580 

atttacaagc catttgtaaa attttaagat aatctgtaac taatacataa aaacaactta 2640 

gcaaatgtgc cattttcaca caacttctct ctgtataggc ctctgaaata tcaataaggc 2700 

taaatattac tttacacagt aagatgtgaa attcacaaaa agtaaaccaa actaaacgaa 2760 

tgaaaaactg gaaataattc gtttccatat ctttccatac gtccatttct gaagtattca 2820 

ggaatgtttt cataatcgaa agaaacgg 2848 

<210> 25 

<211> 3727 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7476938CB1 

<400> 25 

atggttatgg aggctgggga gtccaagggc atagtgctgt catctggcaa gggccttcat 60 

gctgcatcat tcatggtgga aggtgaaaac gtaagagaag ggattggctc agaaatgggc 120 

acctgcccca agtggaccaa tgtttctcat tgcaaaatgg gaataatgcc agttttggtt 180 

aagggcttcg tgctgagcgg aagccggaag caaaagcggg tcctgctagc cccgcggctc 240
cgaactcggt ggtcctggaa gctccgcagg atgggggaga agatggcgga agaggagagg 300
ttccccaata caactcatga gggtttcaat gtcaccctcc acaccaccct ggttgtcacg 360
acgaaactgg tgctcccgac ccctggcaag cccatcctcc ccgtgcagac aggggagcag 420
gcccagcaag aggagcagtc cagcggcatg accattttct tcagcctcct tgtcctagct 480
atctgcatca tattggtgca tttactgatc cgatacagat tacatttctt gccagagagt 540
gttgctgttg tttctttagg tattctcatg ggagcagtta taaaaattat agagtttaaa 600
aaactggcga attggaagga agaagaaatg tttcgtccaa acatgttttt cctcctcctg 660
cttcccccta ttatctttga gtctggatat tcattacaca agggtaactt ctttcaaaat 720
attggttcca tcaccctgtt tgctgttttt gggacggcaa tctccgcttt tgtagtaggt 780
ggaggaattt attttctggg tcaggctgat gtaatctcta aactcaacat gacagacagt 840
tttgcgtttg gctccctaat atctgctgtc gatccagtgg ccactattgc cattttcaat 900
gcacttcatg tggaccccgt gctcaacatg ctggtctttg gagaaagtat tctcaacgat 960
gcagtctcca ttgttctgac caacacagct gaaggtttaa caagaaaaaa tatgtcagat 1020
gtcagtgggt ggcaaacatt tttacaagcc cttgactact tcctcaaaat gttctttggc 1080
tctgcagcgc tcggcactct cactggctta atttctgcat tagtgctgaa gcatattgac 1140
ttgaggaaaa cgccttcctt ggagtttggc atgatgatca tttttgctta tctgccttat 1200
gggcttgcag aaggaatctc actctcaggc atcatggcca tcctgttctc aggcatcgtg 1260
atgtcccact acacgcacca taacctctcc ccagtcaccc agatcctcat gcagcagacc 1320
ctccgcaccg tggccttctt atgtgaaaca tgtgtgtttg catttcttgg cctgtccatt 1380
tttagttttc ctcacaagtt tgaaatttcc tttgtcatct ggtgcatagt gcttgtacta 1440
tttggcagag cggtaaacat tttccctctt tcctacctcc tgaatttctt ccgggatcat 1500
aaaatcacac cgaagatgat gttcatcatg cggtttagtg gcctgcgggg agccatcccc 1560
tatgccctga gcctacacct ggacctggag cccatggaga agcggcagct catcggcacc 1620
accaccatcg tcatcgtgct cttcaccatc ctgctgctgg gcggcagcac catgcccctc 1680
attcgcctca tggacatcga ggacgccaag gcacaccgca ggaacaagaa ggacgtcaac 1740
ctcagcaaga ctgagaagat gggcaacact gtggagtcgg agcacctgtc ggagctcacg 1800
gaggaggagt acgaggccca ctacatcagg cggcaggacc ttaagggctt cgtgtggctg 1860
gacgccaagt acctgaaccc cttcttcact cggaggctga cgcaggagga cctgcaccac 1920
gggcgcatcc agatgaaaac tctcaccaac aagtggtacg aggaggtacg ccagggcccc 1980
tccggctccg aggacgacga gcaggagctg ctctgacgcc aggtgccaag gcttcaggca 2040
ggcaggccca ggatgggcgt ttgctgcgca cagacactca gcaggggcct cgcagagatg 2100
cgtgcatcca gcagcccctt caagacataa gagggcgggg cgaggtactg gctgcagagt 2160
cgccttagtc cagaacctga caggcctctg gagccaggcg acttcttggg aaactgtcat 2220
ctcccgactc ctccctgagc cagcctccgc tcagtgtggc tcctcagccc acagagggga 2280
gggagcatgg ggccaggtgc cagtcatctg tgaagctagg gcgcctaccc ccccacccgg 2340
aggacccctg cggccccctg cctagaggag caccatctac agttgtgcca ttccccagcc 2400
actgccttca tgctgccccc gccggactgg cagagccagg ggtcagccac ctgcctttga 2460
gtcatcaaga tgcctctgca gccacaattc tgacctaagt ggcagggccc agaaatcctg 2520
aaaacctccc gctgcctttt gtgatacttc ctgtgctccc tcagagagaa acggagtgac 2580
cttttgtcct ttacctgatt ggcacttcgc agtctatctc cctgggtagc agacggctgc 2640
tgcccttctc tgggcatgtt ctgaatgttt acactggtac cttctggtat cttctttaga 2700
gccccctgca agctgcaact ctaggctttt atcttgcggg gtcagagcgc cctctagagg 2760
gaaaagctag aggcacaggg tttctgccgg cccacaactg ctgtcttgat ttgcatttta 2820
cagcaaagtg ctgagagcct ctagtcgcct cctgccatct gatctccctc cccaccattc 2880
ccgtactcag ttgttctttt gtctaatcgg aggccactgt gctgaggccc tgcagtgtct 2940
gctcactgct gccatcttcg ctgctagtca gggttccatc ctctttcccc tctcccagtt 3000
ccctaccacg ttggatccca ttcgtcaccc atgctagggt ccccaaagca ctggggcagg 3060
ggccagagca gcagcaccca gtgctccctc ctctactctg acctggggcc ccagcatcct 3120
ggagcacacg ctccacgcac acacacccca gccctgtccc aggggcctgg ccccctcagc 3180
catctcaggg tgaggagctg ccagtcatgt ccagatggaa tgactcccat cctctcctca 3240
tctccccttt gacgagcctc aaactgctca gctcatcaaa gagccattgc caacttccgt 3300
atgtggttct gggtcccagg gagccttgga acctggcacc ctggggtggt ttaattcatc 3360
attaagaagc attcctgctt ctcaagggac acagtggcct gcatgggcca gcatggaccc 3420
tgggctgatc atgtgcattc ctgcttctct ggggacacag tgggcccaca tgggccagca 3480
tggaccctgg gctagagcaa gcacatctcc atctcttcca cctcaggcag tgtggctcca 3540
gatgtcagga gggactgacc tcaggacctt ccaggttcct ctgtgccagg aatgagaggc 3600
caggcccgat cctaccacct cgccttgacc ctgaagtcag agcaggccag ccaagcagga 3660
agcacactgt ttactttttg catgaaaagt aaatgtgtac ttgatagagc taaaatatga 3720
tcttttt 3727



<210> 26 
<211> 2571 
<212> DNA 






WO 02/40541 PCT/US01/46055

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 8128531CB1 

<400> 26 

ttaaagctgg acagaatttt taaaagcaat gaagccagtt ccttggatat atccacgggc 60 
tttgctttga gaaggaactg agtaggcagt gagaagagtc gagtgacgcc tggcccgtga 120 
gtgcctcaac aactgagatg aacgtcgact cgcttgcagg caagttgtca gggaggaagc 180 
cacagcccaa ggaggtcgtc acttgccggg aaggtggctc gggccaggct gcactcaaaa 240 
cccgtgctct gtccacactg ctacggggcc agagccaagg aagcttccac ttcttccccc 300 
agacagcccc aacagcggct accccaagga gccagcagcc ttgtgtcctg ggatccccag 360 
cccctgcaga atgacccacc aggatctgag catcacagcc aaactcatca atggaggtgt 420 
agcagggctc gtgggggtga cctgcgtgtt ccccatcgac ttggccaaga ctcgcctgca 480 
gaaccagcat gggaaagcca tgtacaaagg aatgatcgac tgcctgatga agacggctcg 540 
ggcggagggc ttcttcggca tgtaccgagg ggctgcagtg aacctcactc tggtcactcc 600 
agagaaggcc atcaagctgg cggccaacga ctttttccgg cggctgctca tggaagatgg 660 
gatgcagcgg aacctgaaga tggagatgct tgccgggtgt ggggctggga tgtgccaggt 720 
cgtggtgacc tgtcccatgg aaatgctcaa gattcagctg caggatgctg gacgcctggc 780 
cgtccatcat cagggctcgg cctcagcacc ctccacctcc aggtcctaca caactggttc 840 
ggcttccacc cacaggcgcc cctctgccac cctcattgcc tgggagctgc tccgcactca 900 
gggcctggct gggctctaca ggggcctggg tgccactctc ctcagagaca ttcctttctc 960 
catcatctac fctcccactgt ttgccaacct taacaacctg gggttcaacg agctcgccgg 1020 
taaggcgtcc tttgcacatt ccttcgtgtc aggctgtgtg gcaggttcca tagctgcggt 1080 
cgcagtgacg cctctagatg ttctgaaaac tcgaatccaa accctcaaga aaggcctggg 1140 
cgaggacatg tacagtggga tcaccgactg tgccaggaaa ctctggattc aggagggacc 1200 
atctgccttc atgaaaggcg ctggctgccg ggcactggtc atagcacctc tctttgggat 1260 
tgctcaaggg gtctatttta ttgggattgg agagcgcatc ttaaagtgtt ttgactagac 1320 
agagctggag gtcaagtccc tgcgcttgcc gccctctctc tagctgtttc acttagccta 1380 
gagggggcaa gggcaggtgg ggccactctg gcctgcctgg tcctctgcgt tgtagtgcta 1440 
cctcaatctc gggagaaaca gccctatatt ctaacaagtt gagcacagcc ttcttcccct 1500 
tcgtgtctac actcgttttc ctttgtgggc acagctacca ggggcttttg gaagccccta 1560 
accacctact tttcaacaaa aatggtactt tcgttgtatt aattgcagga ccttaacagg 1620 
tagtcacaat agaagggttg tttctgtatt ttaacatttc tatttcacag tcaaactcgg 1680 
cattcttcag tcagcttgag gatttagcat tgttaatctt ggactccata acttatgagt 1740 
cctagcactg attttgagga aaaggaggat cagaagttca agggaccgtg aaagccctca 1800 
gagtcagcac ctagtttgag accaagcacc ctttcgaatc cctggatggc tgagggggct 1860 
gaggccggct ctgactgggc agctcagccc ctcccccaga gcccagggtc ttgcacaccc 1920
ctccctgtaa ccaaggaaca ctctgaaata aaggtgaatg gctaaaatct catctgttca 1980
tcagtgggta cagcagatag gctgcagtga atgctatcac catctacttt tctacgtcca 2040
ttfccaaaacc aaacattaaa aagggcatag aagcagaccc ccgtcactct tcaaactgtt 2100
acttgtgggg gtggaggaac acagccatag ggaaatatct gcttgttagt gacactgggt 2160
tttaagcctt gattctatcc cttcataagt gaatcgtctt gaggagctga gtttgctgtg 2220
agagccctcc tcacgcacct cgattcctcc cccaaaggct gctacaggag agataatgtc 2280
acagcagcag ggccaagtcc taagaaaatc agcacctgct gcaggagctg gtgtttacaa 2340
tagtcccatc tactgtgaaa cctgggctaa caaggaagag gatggtgcta acatggtcag 2400
ccctgggggc ctcactctct gttatgagaa ctgcatttga gtatgggccc tggagacaga 2460
cctcagttca agtcccagct ccaccatgta ctagctgcaa ggccctgggc agctcttagt 2520
cgtcacctac ggaaaaataa aacatgggac agggaaggaa gaacagggcc t 2571

<210> 27 

<211> 1660 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7476757CB1 

<400> 27 

ggagttctgg gctgtagtgc gctatgccga tcgggtgtcc gcactaagtt cggcatcaat 60
atggtgacct cccgggagcg ggggaccacc aggtcgcctc tcattctgta caaaggtgtg 120
ctatggcatt ggtggggtcc ccaaccagat agcctccagc gccacagcct tttacctgca 180
gcttttcctg cttgatatag cacagatccc tgccgcccag gtgtcacttg ttctgtttgg 240 
gggaaaggtg tctggggcgg ctgctgaccc tgtggctggg ttcttcatca acaggagcca 300 
gaggacaggg tctggacggc tcatgccttg ggtgctgggc tgcaccccct tcatcgccct 360 
ggcctacttc ttcctgtggt tcctgccccc cttcaccagc ctgcgaggcc tctggtacac 420 
gactttctac tgcctgttcc aggccctggc cacgttcttc caggtgccct acacagcgct 480 
caccatgctg ctgactccct gcccaaggga gcgggactcg gccaccgcca taccggatga 540 
ctgtggagat ggcgggaaca ctgatggggg ccactgtcca cgggctcatc gtgtccggcg 600 
cccacagacc ccacaggtgc gaggccactg cgaccccggg gccagtcact gtctccccga 660 
atgcagccat ctctactgca ttgcggctgc cgtggttgta gtgacttacc ccgtgtgcat 720 
cagtttactg tgcctagggg tgaaggagcg gccaggtttt gcttttgaac tctgcgaagc 780 
caaggtgaca cgcttctgcg ttgcagaccc ctctgcccca gcctcaggcc caggcttgag 840 
tttcctggct gggctgagcc tcactacccg gcacccaccc tacctgaagc tggtgatctc 900 
cttcctgttc atctctgctg ctgttcaggt ggagcagagc tacctggtcc tgttctgtac 960 
acatgcctcc cagctacacg accacgtcca gggcctggtc tcagccgtgc tgagcacccc 1020 
gctgtgggag tgggttctcc agcgctttgg gaagaagacg tcagcctttg ggatctttgc 1080 
gatggtgccc tttgcgatct tgctggctgc tgtgcccaca gcacctgtgg catatgtcgt 1140 
ggcctttgta tctggcgtga gcattgctgt gtccttgctg ctaccctggt ccatgctgcc 1200 
agacgtggtg gatgactttc agctgcagca ccgtcacggg ccaggcctgg agaccatctt 1260 
ctactcctcc tacgtcttct tcaccaagct gtctggcgca tgtgccctgg gcatctccac 1320
cctcagtctg gagttctcgg ggtataaggc aggggtctgc aagcaagcag aggaggtggt 1380
ggtcaccctc aaagtcctca ttggcgccgt gcccacctgc atgatccttg ctgggctctg 1440
catcctcatg gtcggctcca ctccaaagac acccagtcgg gacgcctcca gccggctgag 1500
ccttcggaga cgtgcacaag cacccaatgt tcacacaagt aaggtccacg agcatgcaca 1560
tatcatgcag gcccacgcgg gacaggcagt gggtggcctt gtcatcagcc actccctgct 1620
gagggtgacg gcctcgggct ctgcagcaga gagatactga 1660

<210> 28 

<211> 2743 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 266243CB1 

<400> 28 

atggcggcgg ccgcggtggg cgcgggccac ggcgcggggg gcccgggcgc agcgagcagc 60
agtggtgggg cgcgcgaggg cgcgcgggtg gcggcgctgt gcctgctgtg gtacgcgctg 120
agcgcgggcg gcaacgtggt caacaaggtg atcctgagcg ccttcccgtt cccggtgacc 180
gtgtcgctgt gccacatcct ggctctgtgc gctgggctcc cgccgctgct gcgcgcctgg 240
cgcgtgcccc ccgcgccgcc cgtctcgggc cccggaccca gtccgcatcc gtcgtccggc 300
ccgctgctgc cgccgcgctt ctacccgcgc tacgtgctac cgctcgcctt cggcaagtac 360
ttcgcgtccg tgtcagcgca cgtcagcatc tggaaggtgc ccgtgtccta tgcacacacc 420
gtcaaggcca ccatgcccat ctgggtggtc ctcctgtccc ggatcattat gaaggagaag 480
cagagcacca aggtatactt gtcactcatc cccatcatca gcggtgtcct gctggccacc 540 
gtcaccgagt tgtcttttga catgtgggga ctcgtcagcg ccctcgccgc cacgctgtgc 600 
ttctcgcttc agaacatttt ctccaaaaag gtcttgcgag attcacggat ccaccatctc 660
cggctgctca acatcctggg ctgccacgcc gtcttcttta tgatccccac ctgggttctg 720 
gtggacctct cggctttcct ggtcagcagc gacttgacct acgtctacca gtggccctgg 780 
acgctcctgc tcctggctgt cagcggcttc tgtaactttg cccagaatgt tatcgccttc 840 
agcatcctca acctcgttag cccgctgagc tactcggtcg ccaatgccac caaaagaatc 900
atggtcatca cggtgtccct gatcatgctg cgcaacccag tcaccagcac caacgtcctg 960
ggcatgatga ccgccatcct gggggtcttc ctctataaca agaccaagta cgatgcaaac 1020
cagcaagcca ggaagcacct cctccccgtc accacagcag acctgagcag caaggagcgt 1080
caccggagcc cactggagaa gccccacaac ggcctcctct tcccccagca cggggactat 1140
cagtacggcc gcaacaacat cttaacagac cacttccaat acagccggca gagctaccca 1200
aactcgtaca gtttgaaccg ctatgatgtg tagagtccaa aggacaggac cagactgttg 1260
gtgactcctt ccccggcccc cacagcagta tcagaaactt ctgacaatca gtgaatgtac 1320
aacccagccg aggggacggt gcataactct ccatcagaag ccctggggtt cctggccccc 1380
cgtgagccgc aggaggatgc gttgcctgca gtgcagacgg ccgtgagctc tgggcaaacc 1440
taaacagaga ccagtgtctc atgctctttc ttcctggagt ctgtcatctg agggccgtgt 1500
ccctgcggag atcttggcca cgttgtacct ttccatgtgg aattattccc caagcagtgt 1560
agctcagagc acttgtgtct gcattccaga taacattcag gacctgtgtg aaaagctggg 1620
gtcactgtgg ctgtagacca tgaactggca gtgggggtgt ccagggcggt gcttgagaac 1680 
gtcagactgg ctagtttaat tccctggcgc agatacgcat aggaccaaca gggtcaccaa 1740 
gcagacaggg agcccgcgag aatcattcaa aacatcccca gccacagaga tggatccagt 1800 
ttcctggtca tccccttagc agttcacaag ttcctggcaa atgttccaaa gcaaaaagcg 1860 
attgcaatta gcatccagtt cctgcagcct ggtgctctgc cctgcacgtc agggttggca 1920 
tccacccaga tccagatgga agggaaactt ctctcttctc ctttgcctcc tcttccctca 1980 
ccagagcagg gcgcttctct tggggtggtg agaaggatct tcgagaaatc gtgttcagta 2040 
tttcaagctc tatttctgtg gcacatgtct tttgagaggc atcttcacct cttctgtgat 2100 
gacttggtat gttgtttggt agagagatct tgattttcgg aggatcttgc atttttctag 2160 
ggaatatttt gtagttgtgt gtgtgtgttt ttgccttggt ccccattatg ggatgcatta 2220 
ggactggcct atgcatcgaa aatctttttg tttgtaaacg tttaaaaaca aagttccccg 2280 
gccaggcaca gtggctcaca cctgtagtcc cggcactttg ggaggccaag atgggcggat 2340 
cacgaggtca ggagttcgag accagcctgg ccaacatggt gaggccccgt ctctactagg 2400 
agtacagaaa ttagccgggc atggtgtcgc gtgcctgtgg tcccagctcc tcgggctgct 2460 
gaggcaggcg aattgcttga acctgggagt gcagtgatgc gacctcggct gactgcaacc 2520 
tctacctccc gggttcaaac aattgtcttg tctcagcctc ccaagtagct gggattacag 2580 
gtgtgcacca ccatgcctgg ctaattttta gtagagatgg ggtttcacca tgttggcctg 2640 
gatggtctcg aactcctgac ctcagatgat ccacctgcct tggcctccca aagtgctggg 2700 
attacaggca tgagccacca cacaaccgac cttggccagc aca 2743 

<210> 29 

<211> 3239 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 6585710CB1 

<400> 29 

ggcttaaagt aggggcggcc agcacatggt ccattatttc acagccatcg gctacccctg 60 
tcctcgctac agcaatcctg ctgacttcta tgtggacctg accagcattg acaggcgcag 120 
cagagagcag gaattggcca ccagggagaa ggctcagtca ctcgcagccc tgtttctaga 180 
aaaagtgcgt gacttagatg actttctatg gaaagcagag acgaaggatc ttgacgagga 240 
cacctgtgtg gaaagcagcg tgaccccact agacaccaac tgcctcccga gtcctacgaa 300 
gatgcctggg gcggtgcagc agtttacgac gctgatccgt cgtcagattt ccaacgactt 360 
ccgagacctg cccaccctcc tcatccatgg ggcggaggcc tgtctgatgt caatgaccat 420 
cggcttcctc tattttggcc atgggagcat ccagctctcc ttcatggata cagccgccct 480 
cttgttcatg atcggtgctc tcatcccttt caacgtcatt ctggatgtca tctccaaatg 540 
ttactcagag agggcaatgc tttactatga actggaagac gggctgtaca ccactggtcc 600 
atatttcttt gccaagatcc tcggggagct tccggagcac tgtgcctaca tcatcatcta 660 
cgggatgccc acctactggc tggccaacct gaggccaggc ctccagccct tcctgctgca 720 
cttcctgctg gtgtggctgg tggtcttctg ttgcaggatt atggccctgg ccgccgcggc 780 
cctgctcccc accttccaca tggcctcctt cttcagcaat gccctctaca actccttcta 840 
cctcgccggg ggcttcatga taaacttgag cagcctgtgg acagtgcccg cgtggatttc 900 
caaagtgtcc ttcctgcggt ggtgttttga agggctgatg aagattcagt tcagcagaag 960 
aacttataaa atgcctctcg ggaacctcac catcgcggtc tcaggagata aaatcctcag 1020 
tgccatggag ctggactcgt accctctcta cgccatctac ctcatcgtca ttggcctcag 1080 
cggtggcttc atggtcctgt actacgtgtc cttaaggttc atcaaacaga aaccaagtca 1140 
agactggtga ttcacgccag acgtctgccc gctggtgggg gacctgagca gacccttcaa 1200 
ctgcactccc tcctcaggag ccccttcctg gggacagtga ggacaatgac cctacagatg 1260 
ctcagctaca tccggcccag ggtgctgcag tggcacagac cagccacagg atggcagtag 1320 
aataaagaca gtcgaaaggg atttctgctc actggcagga gactgcgatg actgggagaa 1380 
aacctgcact cggtggcacc tacaacgttg ctaatttatt tccttttgat atgcatttat 1440 
ataggcaact cgatatagga tgggagcaaa ctaggaatga attgggtagc tagactgtgc 1500 
aggaattgtt ggaacctgga gggaacaata acagtagcta gcagatttgg cttcatcttc 1560 
caggggcccc acactccgtg gtgagccacc atcaatacag aaagtgacct aagatgtacc 1620 
agcaagatgc catcccttct ttttgtgtgg ggtcatgggc tccaaaagcc aacgtgaaca 1680 
attaaaaatg tattgagcat ctactctgta gcaggtcctg tgaaaacact ttaggtggac 1740 
aatcccttga ggtaagtggt atcccatttt ataggtgtga aaactgaagc aaaaattcat 1800 
tttcctaagg gcacatggat acttgtggtg gagtcatatg gggatcagaa aagcctttga 1860 
ggccttggag ttagagggca gaaggcaagg cctgagccgc tgtaagccct taggagttta 1920 
ggaaggctcc agaagacaaa tggggtctgt agaggctgtt aactcagcca ggcttcttag 1980 
agttgcattt cactaactga tatggtttgg ctctgcgtcc tcacccaaat ctcaccttga 2040 
attgtaataa tccccaagtg tcaagggcgg gaccagatgg agataattga atcatagggg 2100 
tggtttctct gatgctgttc tcctgagagt gagtgagttc tgatgagatc cgacggtttt 2160 
ataaggggct tccctcttcg ctcggctctc attctctctc ccgctaccct gtgaagagga 2220 
gccttccacc acgactgcaa gtttcctgag gctgccccag ccgtgctgaa ctgtgagtca 2280 
gttaaacctc ttttctttat aaattaccca gtcttgggta tttcttcata gcagtgtgag 2340 
agcagatgaa tacactggcc ctgcctgggt ttcagaacca gccttgaacc tttcacagtg 2400 
gccagaggat ggggaggcag aggcccaggt tgcacacttc ttgcctgagt gttggggact 2460 
atctgaccca aaacaggtgc acagagggca ggagaggatg ttcccaaagg aaaattagag 2520 
tttagaatca aaagaagggg aaggtgcgtg tttgggaggt aaatagcaaa tactcttcat 2580 
aggttcacta gagtcttgct actccaagta cgatccctgg gccagcagaa tgggcacagc 2640 
tggagctgat tggataggtc ccatgagcct caggccccac ccaggctcaa tgagtcagag 2700 
tctgcgtctt aagaagaccc cctggtgatc tgtgcacatt caagcatgct gccgttttcc 2760 
aaagcacttg caacactcag gatgcttgca cggtcatgtt gccaccatcc aacctgcaga 2820 
ccccattctt gagattgact gggagttcct atcatgtcct ccatagcaag gggatctaga 2880 
ccagaatcaa gccttggatc tagttctcaa gtctctttgt ctctttcagt ttaggaacag 2940 
tttgtcaact ttccttcact ttgtgacctt gatacttgag tttgaaggct gtctctcaat 3000 
ttgtgttctg cccagtgcat cctatcagaa ggcatgtgat ttcaacttct cccataccaa 3060 
caacgttcac tttgatcact tgattaaagg ggtgtctgct aggcttctcc acagccaagt 3120 
tactattttc cctcccttta taattaataa gcattttgta agtgggtact ttgaaactat 3180 
gtaaattgta aactttccat ttatgcattt taaaattttg attgatgtaa aaaaaaaaa 3239 

<210> 30 

<211> 1615 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7483599CB1 

<400> 30 

atggacaaat tcctagacac atacaatcta ccaagattga accaggaaga aatccaaaac 60 
ctgaagagac caataacaag taatgagatc aaagcaataa taaaaagtct ccagatgtca 120 
ttgcttggaa gggactacaa cagtgagctg aactccttgg acaacggacc tcagtcaccc 180 
tcagagagca gcagtagcat tacttcagag aatgtccatc ctgctggaga agctggacta 240 
tcgatgatgc aaactttgat ccacttgttg aaatgcaaca ttggcacagg gctcctgggg 300 
cttcccctgg ccataaagaa tgccggcttg ttggtcggtc ctgtcagcct tctggccatc 360 
ggggtcctca ccgtgcactg catggtcatc ctgttgaact gbgctcaaca cctcagccag 420 
cctagactgc agaagacttt tgtgaactat ggagaggcca cgatgtacgg ccttgaaacc 480 
tgcccgaaca cctggctgag ggcccatgca gtgtggggaa ggtacactgt cagcttctta 540 
ttagtcatca cccagctggg cttctgcagt gtttatttta tgtttatggc agacaattta 600 
caacagatgg tggaaaaagc ccacgtgacc tccaacatct gccagcccag ggagattctg 660 
acgctgaccc ccatcctgga cattcgtttc tacatgctga taatcctgcc cttcctgatc 720 
ctgttggtgt ttatccagaa cctcaaggtg ctgtccgtct tctcgacatt ggccaacatc 780 
accacccttg ggagcatggc tctgatcttt gagtatatca tggaggggat tccatatccc 840 
agcaacctac ccttgatggc aaactggaag accttcttgc tgttctttgg tacagccatc 900 
ttcacatttg aaggcgtcgg tatggttctg cctctcaaaa accagatgaa gcatccacag 960 
cagttttctt ttgttctgta cttggggatg tccattgtca tcatcctcta tatcttactg 1020 
gggacactgg gctacatgaa gtttgggtca gacacccagg ccagcatcac cctcaacttg 1080 
cccaattgct ggtatgtcct gcccacctca ggtgagatag ggagagacac tggaactgtt 1140 
ctggttgtca tagcagagag cacagcaaag ctgagccatg aagctggtaa tccatcactg 1200 
gaagtgacat atgtctctcc tgctcacact gcatcagtca aagcaagcca catggccgca 1260 
cctcactcca agggggcagg gaagtgcaat tctgccatgt gcctggaagt atttggtgaa 1320 
cagcacaaat aactgctgtg ctacctgatg cagtgaccgt ggggattaat tggattaata 1380 
catagataat gcttagaaaa gtgcttagca cgtggtaagc attcatgagc gttagctatt 1440 
atcattgtta tgcatcccca cagccttcat ttttccaagg tgagtaggat gatggtgcat 1500 
ttatttccca caaatccaga gctgtagaat gagaaaaatg taaccatccc cacccacctt 1560 
gctgtgttat gataatgact agatgagaca ataaatgtgg agtttctttg aaaaa 1615 

<210> 31 
<211> 1245 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 
<223> Incyte ID No: 2507246CB1 

<400> 31 

atggcgacgg gcggccagca gaaggagaac acgctgcttc acctcttcgc cggcgggtgt 60 
ggaggcacag ttggtgctat tttcacttgt ccactagaag tcattaagac acggttgcag 120 
tcttcaagat tagctctccg gacagtctac tatcctcagg ttcatctggg gaccattagt 180 
ggagctggaa tggtgagacc aacatccgtg acacctggac tctttcaggt tctgaagtcg 240 
atcttggaga aagagggacc aaagtcactt tttagaggct tgggtccaaa tttggttgga 300 
gttgcaccat caagggctgt atactttgca tgttactcca aagccaaaga gcaatttaat 360 
ggcattttcg tgcctaacag caatattgtg catattttct cagctggctc tgcagctttt 420 
atcacaaatt ccttaatgaa tcctatatgg atggttaaaa cccgaatgca gctagaacag 480 
aaagtgaggg gctctaagca gatgaataca ctccagtgtg ctcgttacgt ttaccagacc 540 
gaaggcattc gtggcttcta tagaggatta actgcctcgt atgctggaat ttccgaaact 600 
ataatctgct ttgctattta tgaaagttta aagaagtatc tgaaagaagc tccattagcc 660 
tcttctgcaa atgggactga gaaaaattcc acaagttttt ttggacttat ggcagctgct 720 
gctctttcta agggctgtgc ctcctgcatt gcttatccac acgaagtcat aaggacgagg 780 
ctccgggaag agggcaccaa gtacaagtct tttgtccaga cggcgcgcct ggtgttccgg 840 
gaagaaggct accttgcctt ttatagagga ctgtttgccc agcttatccg gcagatccca 900 
aatactgcca ttgtgttgtc tacttatgag ttaattgtgt acctgttaga agaccgtact 960 
cagtaacagg ccggaaaatt gtgctctaga agaataaaac tgaaaaactc tagagaattt 1020 
tttttcccca ttgatgttta gaaagtttga gactgaaaca ggaaaggcca taaaatatct 1080 
ggttcatatc acctgttgga catttccttt tggattcatg ctttctggaa ggtttaaatt 1140 
cattaacgtt aatagttaat tataactttt tttttaactt aagaggattc agggttaagc 1200 
accaactaaa ttaaatcatg ctatttaatt taagtataaa aaaaa 1245 

<210> 32 
<211> 4169 
<212> DNA 
<213> Homo sapiens 

<220> 

<221> misc_feature 
<223> Incyte ID No: 3033505CB1 

<400> 32 

gcacggctca ctgacacgca gctttggtta aagagcgggc gcacaggagg ggaggagacc 60 
gcgcgcggga cggggaggaa tggcctgtcc gcgttaaacc atcacaagcc atggttgcgg 120 
aagggccacg cgtcccccag taggagaatg actccgattc gtgaccctca gcgccggtgc 180 
atgtcgatat atttattgag tgtctactgt gtgccaggca ctatatctat gtgcatagaa 240 
aaaccctgga aggccataca acaatatata tagagtgatc gtctctgctt gctgagctaa 300 
caggggtgtc aagcttccat tttggtatct acttctaaat acactcagaa caggagaaat 360 
ttggactaat tttcaaacta cagacacttt ctaatcatga tgcatttcaa aagtggactc 420 
gaattaactg agttgcaaaa catgacagtg cccgaggatg ataacattag caatgactcc 480 
aatgatttca ccgaagtaga aaatggtcag ataaatagca agtttatttc tgatcgtgaa 540 
agtagaagaa gtctcacaaa cagccatttg gaaaaaaaga agtgtgatga gtatattcca 600 
ggtacaacct ccttaggcat gtctgttttt aacctaagca acgccattat gggcagtggg 660 
attttgggac tcgcctttgc cctggcaaac actggaatcc tactttttct ggtacttttg 720 
acttcagtga cattgctgtc tatatattca ataaacctcc tattgatctg ttcaaaagaa 780 
acaggctgca tggtgtatga aaagctgggg gaacaagtct ttggcaccac agggaagttc 840 
gtaatctttg gagccacctc tctacagaac actggagcaa tgcfcgagcta cctcttcatc 900 
gtaaaaaatg aactaccctc tgccataaag tttctaatgg gaaaggaaga gacattttca 960 
gcctggtacg tggatggccg cgttctggtg gtgatagtta cctttggcat aattctccct 1020 
ctgtgtctct tgaagaactt agggtatctt ggctatacta gtggattttc cttgagctgt 1080 
atggtttttt tcctaattgt ggttatttac aagaaatttc aaattccctg cattgttcca 1140 
gagctaaatt caacaataag tgctaattca acaaatgctg acacgtgtac gccaaaatat 1200 
gttaccttca attcaaagac cgtgtatgct ttacccacca ttgcatttgc atttgtttgc 1260 
cacccgtcag tcctgccaat ttacagtgag cttaaagacc gatcacagaa aaaaatgcag 1320 
atggtttcaa acatctcctt tttcgccatg tttgttatgt acttcttgac tgccattttt 1380 
ggctacttga cattctatga caacgtgcag tccgacctcc ttcacaaata tcagagtaaa 1440 
gatgacattc tcatcctgac agtgcggctg gctgtcattg ttgctgtgat cctcacagtg 1500 
ccggtgttat ttttcacggt tcgttcatct ttatttgaac tggctaagaa aacaaagttt 1560 
aatttatgtc gtcataccgt ggttacctgc atactcttgg ttgttatcaa cttgttggtg 1620 
atcttcatac cctccatgaa ggatattttt ggagtcgtag gagttacatc tgctaacatg 1680 
cttattttca ttcttccttc atctctttat ttaaaaatca cagaccagga tggagataaa 1740 
ggaactcaaa gaatttgggc tgcccttttc ttgggcctgg gggtgttgtt ctccttggtc 1800 
agcattccct tggtcatcta tgactgggcc tgctcatcga gtagtgacga aggccactga 1860 
aacccgccga gaaaaagaaa catccctgtt gtctgctcag tcaagtcccc acacatcagc 1920 
aatctctcac cacttctttt gcaagtttac agaagcaaac agaaatgtac aggatactta 1980 
aaatggaata actttttggt tgcaaaacag agacatggtt ctataatgct tcatgtccct 2040 
ccaagatttg agatcaattt agggattgtg aaattttttt tttcaaattt catacaatca 2100 
tatttcccag tacttttcac aatcattttt tacccatcta actctatgtt ttgtggcttc 2160 
ccggtctctt agaactttga aaacatgata tacaataatg tttatttatt atacatccag 2220 
attctgaaat aattttccta ctgatgttca gctcacacta tctgtacctt tttagaagag 2280 
aaaagaatct tgaattgtat atatttattt tgctttacag aaaaaaatgg tttcgtaaat 2340 
aatttgccta ttttggttaa catagcacat ggagataatc atctgaaagt tatagggcac 2400 
tgccactgct gaatcagagc atgcccaata tttgaggtgg ctctgatttc ctggcagctg 2460 
aactcgggta gtccagtggc ctagctggta ccacatctat tcccatccag agacattctc 2520 
tggcaagtgt tctcagctga aaagtggttg gggatgattc ttaccttggt aattaaatga 2580 
agctacacat ttgggtaatc tagcaaatga agtatttttt ccctcttggc aacttgtgtc 2640 
agagttactc tggtctgagt caactttcgc tggggaaaac ctatggaacc tactgcaaaa 2700 
agattgtcca aaatgcctaa gaaaatactc ctctgatgca tttagccttc aaccctacct 2760 
gtcttgctga agggagaaaa atgttttagt acattatagg cccagcagct tttattcatg 2820 
tccaccagct agttgcacag agaatcatgt gtacctaact aaggatgatc taggataagt 2880 
aactcctgtt ttatattgag tattttaggg aagtctttaa aagacttgtt ttatatctat 2940 
aaatctaggt tattacaaat acaagaattt tgtaccttaa ataagcctca tttctatttc 3000 
ttcttcatta attctccafcc tagtcttgtg aaaaaaaaaa aaaaaaaacc ctcagagata 3060 
gtctttgtga agagcttctg acagaatcac tgagtacctt ccttccccca gatgaggaag 3120 
acaagggggt ctcagtgtct gtgctgtctc ctcttctctt ccccaaccaa ggactgtgcc 3180 
attactgccc gtctcaactg tccatgcagg aggacagagt tgcctggtac tcttaccctt 3240 
gtccctctcc taaagggagc acaaggaaac tgaagagact gaaaaagaag agagtttgta 3300 
gctgaaaaag aatagggata gcaaggaaac ccagaactgc attcccctaa gtggggccat 3360 
cccatgtgat tgaattgtcc atagcttgcc tatggtgaga aatgtgcatg ctccgtgagc 3420 
tggtctcttg aaacaggact tatgcttcct ctatattctg gttaaatttt ccaaacacat 3480 
aagttcactg agcacagatt tcttatccag agacaagtag aatctaaccg cagactgttg 3540 
gcagagtttc caggcactta gccatgttcc cttcctgact caaatcccca aaggccttca 3600 
ctctcactga gaatcacact actgtcccat agataaggca ggcattgaag cacctgtcgt 3660 
gatcctctag gggggagaat gaaaggttat ttcctgcatt gcatcatcat agcttttaat 3720 
ataatgctac agaatcatat ccacattagg ttagagttca gatatttgga tatgaatacc 3780 
taacctagcc atatccatgg ccatctctgt tcttttcagc aatgttttcc atattatatt 3840 
agcaatgaca gaaacagaac aagccaagat ccagtcagtt cttgggagct tgtctagagc 3900 
accaagtaat gaaatagcca ggtagtggga tgactgtacc tttaaaaata cataatttag 3960 
tttgcaagct atattatgct actttctatt ttcctcgtta ctttatagca attcatttta 4020 
ccctcacaaa gtcaatttag aaccttatca ttaactggga tgtgtagtga tatttttggg 4080 
cctctgggtt tcatgtgtca ataccaggca tatctctttc aaatagattt atttagaggg 4140 
ggccagtgtt gttgactgtg tggaacccc 4169 

<210> 33 

<211> 3440 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 4027693CB1 

<400> 33 

gatccccacc acaccaccag cccggccgca cggggcactg agccgggtgc tgagcaccgg 60 
aggccccgcc gaggccggga ctcaggacct gcagagaaac gcctcctgat tttgtcttac 120 
aatggaactt aaaaagtcgc ctgacggtgg atggggctgg gtgattgtgt ttgtctcctt 180 
ccttactcag tttttgtgtt acggatcccc actagctgtt ggagtcctgt acatagaatg 240 
gctggatgcc tttggtgaag gaaaaggaaa aacagcctgg gttggatccc tggcaagtgg 300 
agttggcttg cttgcaagtc ctgtctgcag tctctgtgtc tcatcttttg gagcaagacc 360 
tgtcacaatc ttcagtggct tcatggtggc tggaggcctg atgttgagca gttttgctcc 420 
caatatctac tttctgtttt tttcctatgg cattgttgta ggtcttggat gtggtttatt 480 
atacactgca acagtgacca ttacgtgcca gtattttgac gatcgccgag gcctagcgct 540 
tggcctgatt tcaacaggtt caagcgttgg ccttttcata tatgctgctc tgcagaggat 600 
gctggttgag ttctatggac tggatggatg cttgctgatt gtgggtgctt tagctttaaa 660 
tatattagcc tgtggcagtc tgatgagacc cctccaatct tctgattgtc ctttgcctaa 720 
aaaaatagct ccagaagatc taccagataa atactccatt tacaatgaaa aaggaaagaa 780 
tctggaagaa aacataaaca ttcttgacaa gagctacagt agtgaggaaa aatgcaggat 840 
cacgttagcc aatggtgact ggaaacaaga cagcctactt cataaaaacc ccacagtgac 900 
acacacaaaa gagcctgaaa cgtacaaaaa gaaagttgca gaacagacat atttttgcaa 960 
acagcttgcc aagaggaagt ggcagttata taaaaactac tgtggtgaaa ctgtggctct 1020 
ttttaaaaac aaagtatttt cagccctttt cattgctatc ttactctttg acatcggagg 1080 
gtttccacct tcattactta tggaagatgt agcaagaagt tcaaacgtga aagaagaaga 1140 
gtttattatg ccacttattt ccattatagg cattatgaca gcagttggta aactgctttt 1200 
agggatactg gctgacttca agtggattaa taccttgtat ctttatgttg ctaccttaat 1260 
catcatgggc ctagccttgt gtgcaattcc atttgccaaa agctatgtca cattggcgtt 1320 
gctttctggg atcctagggt ttcttactgg taattggtcc atctttccat atgtgaccac 1380 
gaagactgtg ggaattgaaa aattagccca tgcctatggg atattaatgt tctttgctgg 1440 
acttggaaat agcctaggac cacccatcgt tggttggttt tatgactgga cccagaccta 1500 
tgatattgca ttttatttta gtggcttctg cgtcctgctg ggaggtttta ttctgctgct 1560 
ggcagccttg ccctcttggg atacatgcaa caagcaactc cccaagccag ctccaacaac 1620 
tttcttgtac aaagttgcct ctaatgttta gaagaatatt ggaagacact atttttgcta 1680 
ttttatacca tatagcaacg atattttaac agattctcaa gcaaattttc tagagtcaag 1740 
actattttct catagcaaaa tttcacaatg actgactctg aatgaattat ttttttttat 1800 
atatcctatt ttttatgtag tgtatccgta gcctctatct cgtatttttt tctatttctc 1860 
ctccccacac catcaatggg actattctgt tttgctgtta ttcactagtt cttaacattg 1920 
taaaaagttt gaccagcctc agaaggcttt ctctgtgtaa agaagtataa tttctctgcc 1980 
gactccattt aatccactgc aaggcaccta gagagactgc tcctatttta aaagtgatgc 2040 
aagcatcatg ataagatatg tgtgaagccc actaggaaat aaatcattct cttctctatg 2100 
tttgacttgc tagtaaacag aagacttcaa gccagccagg aaattaaagt ggcgactaaa 2160 
acagccttaa gaattgcagt ggagcaaatt ggtcattttt taaaaaaata tattttaacc 2220 
tacagtcacc agttttcatt attctattta cctcactgaa gtactcgcat gttgtttggt 2280 
acccactgag caactgtttc agttcctaag gtatttgctg agatgtgggt gaactccaaa 2340 
tggagaagta gtcactgtag actttcttca tggttgacca ctccaacctt gctcactttt 2400 
gcttcttggc catccactca gctgatgttt cctgggaagt gctaatttta cctgtttcca 2460 
aattggaaac acatttctca atcattccgt tctggcaaat gggaaacatc catttgcttt 2520 
gggcacagtg gggatgggct gcaagttctt gcatatcctc ccagtgaagc atttatttgc 2580 
tactatcaga ttttaccact atcaaatata attcaagggc agaattaaac gtgagtgtgt 2640 
gtgtgtgtgt gtgtgtgtgt gtgtgctatg catgctctaa gtctgcatgg gatatgggaa 2700 
tggaaaaggg caataagaaa ttaataccct tatgcagttg catttaacct taagaaaaat 2760 
gtccttggga taaactccaa tgtttaatac attgattttt tttctaaaga aatgggtttt 2820 
aaactttggt atgcatcaga attccctata gatctttttg aaaatatagg tacctgggta 2880 
tcacacatag aacttttaat tctgctggtg taggctgttg cccaaacatc tataatttta 2940 
ctgagctctt caagtgattc tgataacaca gcctggattg agaattttta taagattggc 3000 
aatggaaaaa catttattct tttaaataat aattttttta aaacccaaga ggtcagggga 3060 
ttttataaac caatagccaa gtgttcttta aataggaggc acccttccca ttgtgccaaa 3120 
atcatctttt catttatttt gaaatttgta tgattatttt atacttgtat gttgcctttc 3180 
ttcgaaggcg cctgaagcac tttataaaca caaatcctca caatacctct gtgaggtagg 3240 
taaatagtac ttttctatgt agtaaacctg gaatatggag aatttcataa cagttcattc 3300 
tacttaataa tgcaataatg gagctccaag ttgtcttgga cttctacacc acactcagac 3360 
ttctggaaag ttttctgtac ctcattcttt agtccctgtc aaggttagta aataaaataa 3420 
gtgacataaa aaaaaaaaaa 3440 

<210> 34 

<211> 3699 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7472030CB1 

<400> 34 

atggtttatt ccggaaatgc agagatgttt aacattcaaa aatcaactgc tctaataact 60 
gcagaagaac agccaaaact gagaaaggaa gcagttggat ctattgagat attccgcttt 120 
gctgatggac tggacatcac actcatgatc ctgggtatac tgacatcact gttcaatgga 180 
gcctgccttc ctttaatgcc actgtgtata ggagaaatga gtgataacct tattagtgga 240 
tgtctagtcc acactaacac aacaaattat cagaactgta ctcagtctca agagaagctg 300 
aatgaagata tgactctgtt gaccctgtat tatgttggaa taggtgttgc tgccttgatt 360 
tttggttaca tacagatttc cttgtggatt ataactgcag cacgacagac caagaggatt 420 
cgaaaacagt tttttcattc agttttggca caggacatcg gctggtttga tagctgtgac 480 
atcggtgaac ttaacactcg catgacagat gacattgaca aaatcagtga tggtattgga 540 
gataagattg ctctgttgtt tcaaaacatg tctacttttt cgattggcct ggcagttggt 600 
ttggtgaagg gctggaaact caccctagtg actctatcca cgtctcctct tataatggct 660 
tcagcggcag catgttctag gatggtcatc tcattgacca gtaaggaatt aagtgcctat 720 
tccaaagctg gggctgtggc agaagaagtc ttgtcatcaa tccgaacagt catagccttt 780 
agggcccagg agaaagaact tcaaaggtat acacagaatc tcaaagatgc aaaggatttt 840 
ggcataaaaa ggactatagc ttcaaaagtg tctcttggtg ctgtgtactt ctttatgaat 900 
ggaacctatg gacttgcttt ttggtatgga acctccttga ttcttaatgg agaacctgga 960 
tataccatcg ggactgttct tgctgttttc tttagtgtaa tccatagtag ttattgcatt 1020 
ggagcagcag tccctcactt tgaaaccttc gcaatagccc gaggagctgc ctttcatatt 1080 
ttccaggtta ttgataagaa acccagtata ggtaactttt ccacagctgg atataaacct 1140 
gaatccatag aaggaactgt ggaatttaaa aatgtttctt tcaattatcc atcaaggcca 1200 
tctatcaaga ttctgaaagg tctgaatctc ggaattaagt ctggagagac agtcgccttg 1260 
gtcggtctca atggcagtgg gaagagtacg gtagtccagc ttctgcagag gttatatgat 1320 
ccggatgatg gctttatcat ggtggatgag aatgacatca gagctttaaa tgtgcggcat 1380 
tatcgagacc atattggagt ggttagtcaa gagcctgttt tgttcgggac caccatcagt 1440 
aacaatatca agtatggacg agatgatgtg actgatgaag agatggagag agcagcaagg 1500 
gaagcaaatg cgtatgattt tatcatggag tttcctaata aatttaatac attggtaggg 1560 
gaaaaaggag ctcaaatgag tggagggcag aaacagagga tcgcaattgc tcgtgcctta 1620 
gttcgaaacc ccaagattct gattttagat gaggctacgt ctgccctgga ttcagaaagc 1680 
aagtcagctg ttcaagctgc actggagaag gcgagcaaag gtcggactac aatcgtggta 1740 
gcacaccgac tttctactat tcgaagtgca gatttgattg tgaccctaaa ggatggaatg 1800 
ctggcggaga aaggagcgca tgctgaacta atggcaaaac gaggtctata ttattcactt 1860 
gtgatgtcac aggatattaa aaaagctgat gaacagatgg agtcaatgac atattctact 1920 
gaaagaaaga ccaactcact tcctctgcac tctgtgaaga gcatcaagtc agacttcatt 1980 
gacaaggctg aggaatccac ccaatctaaa gagataagtc ttcctgaagt ctctctatta 2040 
aaaattttaa agttaaacaa gcctgaatgg ccttttgtgg ttctggggac attggcttct 2100 
gttctaaatg gaactgttca tccagtattt tccatcatct ttgcaaaaat tataaccatg 2160 
tttggaaata atgataaaac cacattaaag catgatgcag aaatttattc catgatattc 2220 
gtcattttgg gtgttatttg ctttgtcagt tatttcatgc aggatattgc ctggtttgat 2280 
gaaaaggaaa acagcacagg aggcttgaca acaatattag ccatagatat agcacaaatt 2340 
caaggagcaa caggttccag gattggcgtc ttaacacaaa atgcaactaa catgggactt 2400 
tcagttatca tttcctttat atatggatgg gagatgacat tcctgattct gagtattgct 2460 
ccagtacttg ccgtgacagg aatgattgaa accgcagcaa tgactggatt tgccaacaaa 2520 
gataagcaag aacttaagca tgctggaaag atagcaactg aagctttgga gaatatacgt 2580 
actatagtgt cattaacaag ggaaaaagcc ttcgagcaaa tgtatgaaga gatgcttcag 2640 
actcaacaca gaaatacctc gaagaaagca cagattattg gaagctgtta tgcattcagc 2700 
catgccttta tatattttgc ctatgcggca gggtttcgat ttggagccta tttaattcaa 2760 
gctggacgaa tgaccccaga gggcatgttc atagttttta ctgcaattgc atatggagct 2820 
atggccatcg gagaaacgct cgttttggct cctgaatatt ccaaagccaa atcgggggct 2880 
gcgcatctgt ttgccttgtt ggaaaagaaa ccaaatatag acagccgcag tcaagaaggg 2940 
aaaaagccag acacatgtga agggaattta gagtttcgag aagtctcttt cttctatcca 3000 
tgtcgcccag atgttttcat cctccgtggc ttatccctca gtattgagcg aggaaagaca 3060 
gtagcatttg tggggagcag cggctgtggg aaaagcactt ctgttcaact tctgcagaga 3120 
ctttatgacc ccgtgcaagg acaagtgctg tttgatggtg tggatgcaaa agaattgaat 3180 
gtacagtggc tccgttccca aatagcaatc gttcctcaag agcctgtgct cttcaactgc 3240 
agcattgctg agaacatcgc ctatggtgac aacagccgtg tggtgccatt agatgagatc 3300 
aaagaagccg caaatgcagc aaatatccat tcttttattg aaggtctccc tgagaaatac 3360 
aacacacaag ttggactgaa aggagcacag ctttctggcg gccagaaaca aagactagct 3420 
attgcaaggg ctcttctcca aaaacccaaa attttattgt tggatgaggc cacttcagcc 3480 
ctcgataatg acagtgagaa ggtggttcag catgcccttg ataaagccag gacgggaagg 3540 
acatgcctag tggtcactca caggctctct gcaattcaga acgcagattt gatagtggtt 3600 
ctgcacaatg gaaagataaa ggaacaagga actcatcaag agctcctgag aaatcgagac 3660 
atatatttta agttagtgaa tgcacagtca gtgcagtga 3699 

<210> 35 

<211> 2428 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7476089CB1 

<400> 35 

atagatcctg aaaaggaaac gactgatatc accatcaagc tagtgatcat ccatatggct 60 
tgctgcagtt ctccacaggg ctgcctcctc agcctaagga cgcatgaccc tgcctgcaag 120 
cagcgttcca catatcactg tggaggagga agatggagaa atcaggttat ggtcatccgt 180 
gcacacggga cttctgggaa gggtgactgc ggaatttaga acagtgtcct tgacagcatt 240 
cagtcctgag gattaccaga atgttgctgg cacattagaa tttcaaccag gagaaagata 300 
taaatacatt ttcataaaca tcactgataa ttctattcct gaactggaaa aatcttttaa 360 
agttgagttg ttaaacttgg aaggaggagc cagtctagga gtggcttccc aaattctagt 420 
gacaattgca gcctctgacc acgctcatgg cgtatttgaa tttagccctg agtcactctt 480 
tgtcagtgga actgaaccag aagatgggta tagcactgtt acattaaatg ttataagaca 540 
tcatggaact ctgtctccag tgactttgca ttggaacata gactctgatc ctgatggtga 600 
tctcgccttc acctctggca acatcacatt tgagattggg cagacgagcg ccaatatcac 660 
tgtggagata ttgccEgacg aagacccaga actggataag gcattctctg tgtcagtcct 720 
cagtgtttcc agtggttctt tgggagctca tattaatgcc acgttaacag ttttggctag 780 
tgatgatcca tatgggatat tcattttttc tgagaaaaac agacctgtta aagttgagga 840 
agcaacccag aacatcacac tatcaataat aaggttgaaa ggcctcatgg gaaaagtcct 900 
tgtctcatat gcaacactag atgatatgga aaaaccacct tattttccac ctaatttagc 960 
gagagcaact caaggaagag actatatacc agcttctgga tttgctcttt ttggagctaa 1020 
tcagagtgag gcaacaatag ctatttcaat tttggatgat gatgagccag aaaggtccga 1080 
atctgtcttt atcgaactac tcaactctac tttagtagcg aaagtacaga gtcgttcaat 1140 
tccaaattct ccacgtcttg ggcctaaggt agaaactatt gcgcaactaa ttatcattgc 1200 
caatgatgat gcatttggaa ctcttcagct ctcagcacca attgtccgag tggcagaaaa 1260 
tcatgttgga cccattatca atgtgactag aacaggagga gcatttgcag atgtctctgt 1320 
gaagtttaaa gctgtgccaa taactgcaat agctggtgaa gattatagta tagcttcatc 1380 
agatgtggtc ttgctagaag gggaaaccag taaagccgtg ccaatatatg tcattaatga 1440 
tatctatcct gaactggaag aatcttttct tgtgcaactg atgaatgaaa caacaggagg 1500 
agccagacta ggggctttaa cagaggcagt cattattatt gaggcctctg atgaccccta 1560 
tggattattt ggttttcaga ttactaaact tattgtagag gaacctgagt ttaactcagt 1620 
gaaggtaaac ctgccaataa ttcgaaattc tgggacactc ggcaatgtta ctgttcagtg 1680 
ggttgccacc attaatggac agcttgctac tggcgacctg cgagttgtct caggtaatgt 1740 
gacctttgcc cctggggaaa ccattcaaac cttgttgtta gaggtcctgg ctgacgacgt 1800 
tccggagatt gaagaggtta tccaagtgca actaactgat gcctctggtg gaggtactat 1860 
tgggttagat cgaattgcaa atattattat tcctgccaat gatgatcctt atggtacagt 1920 
agcctttgct cagatggttt atcgtgttca agagcctctg gaaagaagtt cctgtgctaa 1980 
tataactgtc aggcgaagcg gagggcactt tggtcggctg ttgttgttct acagtacttc 2040 
cgacattgat gtagtggctc tggcaatgga agaaggtcaa gatttactgt cctactatga 2100 
atctccaatt caaggggtgc ctgacccact ttggagaact tggatgaatg tctctgccgt 2160 
gggggagccc ctgtatacct gtgccacttt gtgccttaag gaacaagctt gctcagcgtt 2220 
ttcatttttc agtgcttctg agggtcccca gtgtttctgg atgacatcat ggatcagccc 2280 
agctgtcaac aattcagact tctggaccta caggaaaaac atgaccaggg tagcatctct 2340 
tttagtggtc aggctgtggc tgggagtgac tatgagcctg tgacaaggca atgggccata 2400 
atgcaggaag gtgatgaatt cgcaaaaa 2428 

<210> 36 

<211> 2243 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 6428177CB1 

<400> 36 

gtaactccag gacgagaccg gagcgacccg cgcagagcat aggcggcgaa ctgcgcccgg 60 
cgcccgagac cggcagctgc gtggggcggg ggctgcgccc gagcccgatc tgccggctcc 120 
gagtggtctc ggaaagaggg tcgtggtccc gcacggatgc gcttgttggg agaaaccttg 180 
gagattcacg gcaaggcgta aagcctgggg cttccaacga tactctgggc agggatggaa 240 
gcctagatgc ctcaccgcaa ggagcggccg agcgggtcct cgcttcacac acacggcagc 300 
accggcaccg cggagggagg aaacatgtcc cggctgtctc tcacccggtc gcctgtgtct 360 
cccctggctg cccagggcat ccccctgcca gcccagctca ccaagtccaa tgcacctgtg 420 
cacatcgatg tgggcggcca catgtacacc agcagcctgg ccacgctcac caagtaccct 480 
gactccagga taagccgcct cttcaatggc actgaaccca tcgtcctgga cagtttgaag 540 
caacattatt tcattgaccg ggatggggag attttccgct acgtcctgag cttcctgcgg 600 
acgtccaagc tgctgcttcc ggatgacttt aaggacttca gtctgctgta cgaggaggcg 660 
cgctactatc agctccagcc catggtgcgc gagctggagc gctggcagca ggagcaggag 720 
cagcggcgcc gcagccgggc ctgtgactgc ctggtggtgc gcgtcacgcc cgacttgggc 780 
gagcggatcg cactcagcgg cgagaaggcc ctcatcgagg aggtcttccc cgagaccgga 840 
gacgtcatgt gcaactccgt caacgccggc tggaaccagg accccacgca cgtcatccgc 900 
ttcccgctca atggctactg ccggctcaac tcggtacagg tcctggagcg gctgttccag 960 
aggggtttca gcgtggctgc gtcctgtggg ggcggtgtgg actcctccca gttcagcgag 1020 
tatgtgcttt gccgggagga gcggcggccg cagcccaccc ccactgctgt tcgaatcaag 1080 
caggaacccc tggactaggc cctgcttcag tgcccacctg ggccccccca gggacctgga 1140 
aacagtgctg gggagttctg cctgtgtata cttggccgtg ggcatgagac cgagggtgag 1200 
gctggagggt ccaaagctgg cccagcgagc accagggtcc caggtgtcat ggcaacagaa 1260 
cgtgggatgc tggaggcatg cctgcagaag gactgttgat gcgacccaaa gatacagcgg 1320 
tgggatctct gctgccagct ctcccagccc ctcagcttcg cagcctggcg cagcatcctc 1380 
tgaggccccg gggcctgttg gggcggggtt ggaagagccg tctgcagcta cttcagagga 1440 
gctgtttatc cctctccacg cggggcagac tctggcgggt ctcctagcgt ccgagagatg 1500 
gcttattttc tacagtattt aaaatggatg cagccctaac tgcaaaagtc agagaggctg 1560 
acaaggacca atgcttcttt atctggtgct cagttctcag tcagacgtgc agcatggctg 1620 
cagggtggac cagctgcctg gcattcaggc ccagatgcct gcagggctgg ggctctcggg 1680 
acagatgcag ggatgtgtgc tgcagggctg ctgggaggag agtggtgggg gcctgagggc 1740 
tgagtgattc tgtaaccacc tgagaccttc acgtttgctg ccgttggggg ctcaggctgc 1800 
actccccggg tcacctgacc tgctgcccag gggcttccag tcctgtctgt gtggactggc 1860 
acctgggctg ctggagaagt ctcctcccgt tcggaccagc ctcagggctg cacgttacct 1920 
caggaatggg ccccaccatg aaggggccca tctgtcagca gcgtcttcta ggtccccagc 1980 
tcagggagcc atccccagct ccagttttct catgcgaata tgcacagttt taattcacgt 2040 
tgttacacta gcctgccgat gagacccaga cacaggcaga cctggcgctc ttgacccctg 2100 
attccagtga ggactggccc tgaggagtcc ttgcagacct gctgcctgcc ccacgacagg 2160 
cccaaagatg gaccccccct ggccttgtga cagctcccca agtgttctcc ggtggagaaa 2220 
ctgcagagga ctggtgggcg ggg 2243 

<210> 37 

<211> 3711 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7477243CB1 

<400> 37 

gagcgggtgg cccggccgcc cgcctcgctg ctccgcttgg cgccgccggc ccacgccgca 60 
gtgtgttttg tggacggcgc cttcccagac agcccggtag agcccagctc agcgcccggc 120 
agccttcgac gcgatgttcc gccggagctt gaatcgtttt tgtgctggag aagagaaacg 180 
agttggcaca cgcacagtgt ttgttggcaa tcatccagtt tcggaaacag aagcttacat 240 
tgcacaaaga ttttgtgata atagaatagt ctcatctaag tatacacttt ggaattttct 300 
cccaaagaat ctgtttgaac agtttagaag aattgcaaat ttttattttc tcataatctt 360 
ccttgtacag gtcacagtag acacaccaac tagcccagtt accagtggac ttccactttt 420 
ctttgttata actgttacag ccatcaagca gggatatgag gattgtctga gacacagagc 480 
tgacaatgaa gtcaacaaaa gcactgttta cattattgaa aatgcaaagc gagtgagaaa 540 
agaaagtgaa aaaatcaagg ttggtgatgt agtagaagta caggcagatg aaacctttcc 600 
ctgtgatctt attcttctat catcttgcac cactgatgga acctgttatg tcactacagc 660 
cagtcttgat ggggaatcca attgcaagac acattatgca gtacgtgata ccattgcact 720 
gtgtacagca gaatccatcg ataccctccg agcagcaatt gaatgtgaac agcctcaacc 780 
tgacctctac aaatttgttg ggcgaatcaa tatctacagt aatagtcttg aggctgttgc 840 
caggtctttg ggacctgaaa atctcttgct gaaaggagct acgctaaaaa ataccgagaa 900 
gatatatgga gttgctgttt acactggaat ggaaaccaaa atggctttga actaccaagg 960 
gaaatctcag aaacgttctg ctgttgaaaa atctattaat gctttcctga ttgtatattt 1020 
atttatctta ctgaccaaag ctgcagtatg cactactcta aagtatgttt ggcaaagtac 1080 
cccatacaat gatgaacctt ggtataacca aaagactcag aaagagcgag agaccttgaa 1140 
ggttttaaaa atgttcaccg acttcctatc atttatggtt ctattcaact ttatcattcc 1200 
tgtctccatg tacgtcacag tagaaatgca gaaattcttg ggctccttct tcatctcatg 1260 
ggataaggac ttttatgatg aagaaattaa tgaaggagcc ctggttaaca catcagacct 1320 
taatgaagaa cttggtcagg tggattatgt atttacagat aagactggaa cactcactga 1380 
aaacagcatg gaattcattg aatgctgcat agatggccac aaatataaag gtgtaactca 1440 
agaggttgat ggattatctc aaactgatgg aactttaaca tattttgaca aagtagataa 1500 
gaatcgagaa gagctgtttc tacgtgcctt gtgtttatgt catactgtag aaatcaaaac 1560 
aaacgatgct gttgatggag ctacagaatc agctgaatta acctatatct cctcttcacc 1620 
agatgaaata gctttggtga aaggagctaa aaggtacggg ttcacatttt taggaaatcg 1680 
aaatggatat atgagagtag agaaccaaag aaaagaaata gaagaatatg aacttcttca 1740 
caccttaaac tttgatgctg tccggcgacg tatgagtgta attgtgaaga ctcaagaagg 1800 
agacatactt ctcttttgta aaggagcaga ctcggcagtt tttcccagag tgcaaaatca 1860 
tgaaattgag ttaactaaag tccatgtgga acgtaatgca atggatgggt atcggacact 1920 
ctgtgtagcc ttcaaagaaa ttgctccaga tgattatgaa agaattaaca gacagctcat 1980 
agaggcaaaa atggccttac aagacagaga agaaaaaatg gaaaaagttt tcgatgatat 2040 
tgagacaaac atgaatttaa ttggagccac tgcagttgaa gacaagctac aagatcaagc 2100 
tgcagagacc attgaagctc tgcatgcagc aggcctgaaa gtctgggtgc tcactgggga 2160 
caagatggag acagctaaat ccacatgcta tgcctgccgc cttttccaga ccaacactga 2220 
gctcttagaa ctaaccacaa aaaccattga agaaagtgaa aggaaagaag atcgattaca 2280 
tgaattattg atagaatatc gcaagaaatt gctgcatgag tttcctaaaa gtactagaag 2340 
ctttaaaaaa gcatggacag aacatcagga atatggatta atcatagatg gctccacatt 2400 
gtcactcata ctaaattcta gtcaagactc tagttcaaac aattacaaaa gcattttcct 2460 
acaaatatgt atgaagtgta ctgcagtgct ctgctgtcgg atggcaccat tacagaaagc 2520 
ccagattgtc agaatggtga agaatttaaa aggcagccca ataactctgt cgataggtga 2580 
tggtgccaat gatgttagta tgatcttgga atcccatgtg ggaataggha ttaaaggcaa 2640 
agaaggtcgc caagcagcta ggaatagcga ttattctgtt ccaaagttta aacacttaaa 2700 
gaaactgctg ttggctcatg gacatctata ttatgtgaga atagcacacc ttgtacagta 2760 
cttcttctat aagaaccttt gtttcatttt gccacagttt ttgtaccagt tcttctgtgg 2820 
attctcacaa cagccactgt atgatgctgc ttaccttaca atgtacaata tctgcttcac 2880 
atccttgccc atcctggcct atagtctact ggaacagcac atcaacattg acactctgac 2940 
ctcagatccc cgattgtata tgaaaatttc tggcaatgcc atgctacagt tgggcccctt 3000 
cttatattgg acatttctgg ctgcctttga agggacagtg ttcttctttg ggacttactt 3060 
tctttttcag actgcatccc tagaagaaaa tggaaaggta tacggaaact ggacttttgg 3120 
aaccattgtt tttacagtct tagtattcac tgtaaccctg aagcttgcct tggatacccg 3180 
attctggacg tggataaatc actttgtgat ttggggttct ttagccttct atgtattttt 3240 
ctcattcttc tggggaggaa ttatttggcc ttttctcaag caacagagaa tgtattttgt 3300 
atttgcccaa atgctgtctt ctgtatccac atggttggct ataattcttc taatatttat 3360 
cagcctgttc cctgagattc ttctgatagt attaaagaat gtaagaagaa gaagtgccag 3420 
gagaaatctg agctgtagaa gggcatctga ctcattatcc gccagacctt cagtcagacc 3480 
tcttctttta cgaacattct cagacgaatc taatgtattg taacagaatc cgaatcttga 3540 
actgcctatg ttattgtcct acaagcatac tgacagtggt tacagctaaa aaagaaagca 3600 
tgaagaaaca actacaaaaa gttatcatct caggatactt gatatgcaac acactaaacc 3660 
actctcatgt ctagaatcac aataaatttc attaattgag ggtagaggtt a 3711 

<210> 38 

<211> 2717 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7473042CB1 

<400> 38 

cccgccgggc tccaactccg cagcgtcgga gcgcggcggg cagcaacttt ctccccggag 60 
cggccgtggc ggcggctgct gccgtggcag ccggagcgga agccgggagg aagaaagcgg 120 
cggcagcggc ggttgctccc gccggctcgg gctgtctagc tcgccgagac tgccggcccg 180 
cggagccgcg tacccccggg cagccccggg cccctgccct atgtcccgca aggcaagcga 240 
gaatgtggag tacacgctgc ggagcctgag cagcctgatg ggcgagcggc gcaggaagca 300 
gccggagccg gacgcggcga gcgcggccgg ggagtgcagc ctcctggctg ccgccgaatc 360 
gagcaccagc ctgcagagcg cgggcgcggg cggcggcggc gtcggggacc tggagcgcgc 420 
ggcgcggcgg cagttccagc aggacgagac ccccgccttc gtgtacgtgg tggccgtctt 480 
ctccgcgctg ggcggcttcc tgtttggcta tgacaccggg gtggtgtcag gggccatgct 540 
gctgctcaag cggcagctca gtctggacgc gctgtggcag gagctgctgg tgtccagcac 600 
ggtgggggcg gctgccgtct cggcgctggc cggaggcgcc ctcaacggcg tcttcggccg 660 
ccgcgctgcc atcctcctgg ccagtgccct cttcaccgcc ggctccgcgg tgctggctgc 720 
ggccaacaac aaggagacac tgctcgccgg ccgcctggtc gtgggactcg gcatcggcat 780 
tgcttctatg acagtgccag tgtacattgc ggaggtctca ccacccaatt taagaggccg 840 
attagtcacc attaataccc tcttcatcac aggagggcag ttctttgcaa gtgttgttga 900 
tggagccttc agttatctcc agaaggatgg atggaggtac atgttgggac ttgcagtagt 960 
tccggcggtt atacagtttt ttggctttct ctttttgcct gaaagccctc gatggcttat 1020 
tcagaaagga cagactcaga aggcccgtag aattttatct cagatgcgtg gtaaccagac 1080 
cattgatgag gaatatgata gcatcaaaaa caacattgaa gaggaggaaa aagaggttgg 1140 
ctcagctgga cctgtgatct gcagaatgct gagttatccc caaactcgcc gagctttaat 1200 
tgtgggttgt ggcctacaaa tgttccagca gctctcaggc attaacacca tcatgtacta 1260 
cagtgcaacc attctgcaga tgtctggtgt tgaagatgat agacttgcaa tatggctggc 1320 
ttcagttaca gccttcacaa atttcatttt cacacttgtg ggagtctggc ttgttgagaa 1380 
ggtgggccgc agaaagctta cctttggtag tttagcaggt accaccgtag cactcattat 1440 
tcttgccttg ggatttgtgc tatcagccca agtttcccca cgcatcactt ttaagccaat 1500 
agctccgtca ggtcagaacg ccacttgcac aagatacagt tactgtaatg aatgtatgtt 1560 
ggatccagac tgcggtttct gctacaagat gaacaaatca actgtcattg actcctcctg 1620
tgttccagtt aataaagcat ctacaaatga ggcagcctgg ggcaggtgtg aaaatgaaac 1680 
caagttcaaa acagaagata tattttgggc ttacaatttc tgccctactc catactcctg 1740 
gactgcactt ctgggcctta ttttatatct tgtcttcttt gcacctggaa tgggaccaat 1800 
gccttggact gtgaattctg aaatatatcc cctttgggca agaagtacag gaaatgcatg 1860 
ttcatctgga ataaactgga ttttcaatgt cctggtttca ctaacatttt tacacacagc 1920 
agagtatctt acatactatg gagctttctt cctctatgct ggatttgctg ctgtgggact 1980 
ccttttcatc tatggctgtc ttcctgagac caaaggcaaa aaattagagg aaattgaatc 2040 
actctttgac aacaggctat gtacatgtgg cacttcagat tctgatgaag ggagatatat 2100 
tgaatatatt cgggtaaagg gaagtaacta tcatctttct gacaatgatg cttctgatgt 2160
ggaataattt tcagctgctc atatatttag ttatttaaac aaactggggg gagaagaaca 2220 
gcaattggtg acttcactgc cctgcttcta atctggttct ttccacagcc tagttttgat 2280 
tgacttcata ttctagaata cttgattagg aggaagatac aaccatgatg actttttttt 2340 
tccacaagga acaatatttt aaaaaatatt tacagagatt ttaatctaat aattcttaag 2400 
caaatgtgtg taatgccttc ctgaaatagt ctaaaatgaa tattgtaccc agtgacttca 2460 
gtggtatcct tttttcctaa gaccatttat aattattagt ggcaacagag tcagtgctaa 2520 
tctagccaaa ttacatatgt ataatatatt tataaaggat tctgggagat ggtccaaggg 2580 
tgttctgtgt caaaagatgg cctattggcc ctcagttttc ctacagagta gtggcttatc 2640 
tctgatcagc tgttacaaac taaattccat gtaagctttc atcaacaaat tccaaagtgc 2700 
ctcctacaag ggcacag 2717 

<210> 39 

<211> 2235 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7482060CB1 

<400> 39 

agggagcgcc ggagacgggg agctattccg ccccggcggc tccattcggc gcccgcagcc 60 
ctcagggggt cggccccgcg gcttgggaga gggcaccgcg gcctcggtgt gcgcagccct 120 
cgggcgcgag ggtcggcggc gcggacacag ccgcgttccc agccggtggg gctcagcgct 180 
ggcgccggcg aggactcccc ggccacccgc aggtaccgcc gggcggaggg cgcgctacta 240 
gcagcgccgg agatactcga gcccagggac ccccgggcca gcggagggca ggagcggagc 300 
cccgagggag cgcgggcccc gacggcgcgc tcccccgtca gccacgggca ggcaggcccc 360 
gcgtggcggc ttggggtggg gggctgcagc ggggccctcg ggccgaaagt cccccgggcg 420 
gccagccatg accttcgggc gcagcggggc ggcctcggtg gtgctgaacg tgggcggcgc 480 
ccggtattcg ctgtcccggg agctgctgaa ggacttcccg ctgcgccgcg tgagccggct 540 
gcacggctgc cgctccgagc gcgacgtgct cgaggtgtgc gacgactacg accgcgagcg 600 
caacgagtac ttcttcgacc ggcactcgga ggccttcggc ttcatcctgc tctacgtgcg 660 
cggccacggc aagctgcgct tcgcgccgcg gatgtgcgag ctctccttct acaacgagat 720
gatctactgg ggcctggagg gcgcgcacct cgagtactgc tgccagcgcc gcctcgacga 780
ccgcatgtcc gacacctaca ccttctactc ggccgacgag ccgggcgtgc tgggccgcga 840
cgaggcgcgc cccggcgcgc gaggcggctc cctccaggcg ctggctggag cgcatgcggc 900
ggaccttcga ggagcccaca tcctggctag cgtgtcggtg gtgttcgtga tcgtgtccat 960
ggtggtgctg tgcgccagca cgttgcccga ctggcgcaac gcagccgccg acaaccgcag 1020
cctggatgac cggagcagga taattgaagc tatctgcata ggttggttca ctgccgagtg 1080
catcgtgagg ttcattgtct ccaaaaacaa gtgtgagttt gtcaagagac ccctgaacat 1140
cattgattta ctggcaatca cgccgtatta catctctgtg ttgatgacag tgtttacagg 1200
cgagaactct caactccaga gggctggagt caccttgagg gtacttagaa tgatgaggat 1260
tttttgggtg attaagcttg cccgtcactt cattggtctt cagacactcg gtttgactct 1320
caaacgttgc taccgagaga tggttatgtt acttgtcttc atttgtgttg ccatggcaat 1380
ctttagtgca ctttctcagc ttcttgaaca tgggctggac ctggaaacat ccaacaagga 1440
ctttaccagc attcctgctg cctgctggtg ggtgattatc tctatgacta cagttggcta 1500
tggagatatg tatcctatca cagtgcctgg aagaattctt ggaggagttt gtgttgtcag 1560
tggaattgtt ctattggcat tacctatcac ttttatctac catagctttg tgcagtgtta 1620
tcatgagctc aagtttagat ctgctaggag catttgccta acaagtgtca cttctgtgct 1680
gggcactgtg gggtatacag agatgaccat caacgggcct tgccctgacg ccctgagaga 1740
tccttgtacc tgcaaaaagc ccttgaagac ccattctggg gtcctttaca aggccatggc 1800
tgatttgtgg cagtctctag aaggtggccc accggtggag cagctgcccc cagacccctt 1860
gacccggtgg tgcttccacc ctgccggaag caccttgtgt ggccccgcca acagcatggc 1920
ggttgcatcc ccaggaagca ggcccgcagc gcccggaggg ggtttcctga ggacagaggc 1980
ccttgtcctg attgtcgcag caggccctgt cgatggactt aactgtgaaa atcacccttt 2040
caggggtgga tgcaaggact tctgagggcg gagaagtaga taccttcctg gatagctgtg 2100
gagccggggt cctgcatttc cctctggcgt ctctgctgac tgagatgtga agcagtcggc 2160
ccatgtcccg agaagaggtt ggccacaact ctgtgccaca tgctcttcat tttagaatcc 2220
aggatgaagg atatg 2235



<210> 40 

<211> 2563 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 1578772CB1

<220> 

<221> unsure 
<222> 2466 

<223> a, t, c, g, or other



<400> 40 

gagatggaag ataacaatgt tccagtaaga cctttgtagg ctctaagcgc tgaaaaagtt 60
tatggtgccc caggtgtggt tcaaaataaa aacacaaaac tataaaataa acttaagaga 120
taaaattctt atttttttca cacaatgaca ttgtctttta aaaggaatat cacatatcaa 180
agtctccccc taatattttt ggtttccctt actttgctgg tgccctaaac acatataccc 240
tgcttatagg gtaacccaac agtggctttt aaggtggagg tgggctacaa aactggggaa 300
cttgaacaat ggacatctac aaggagactg gtgaataatg ggcatttaat taattggggg 360
gagcgggagg gttagatgtg aatggcagaa tattaagaag ggggttgtgt ggaaggagat 420
ttgggagaag ggagacttcc gaggaagatt aggcagagtg ggcaggaaga ccagctctca 480
tgtgggggtg ggaggctctc ttcctttttt gctcctgttc ctccttttct ctagctggca 540
ggcctcttct cctgcacagt ggtcctgtcg gtgctgctgt ggctggggcc cttcttttac 600
tatctgccca aggctgtcct ggcttgcatc aacatctcca gcatgcgcca ggtgttctgc 660
cagatgcagg aacttccaca actatggcac atcagccgag tggactttgc tgtgtggatg 720
gtcacctggg tggcagtagt gaccctgagt gtggatttgg gcctggctgt gggtgtggtc 780
ttctccatga tgactgtggt ctgccgcacc cggagctcct ccaggtcccg gggctctgca 840
tcctgagcta tccaacacca ctgtactttg ggacccgtgg gcagtttcgc tgcaacctgg 900
agtggcacct ggggctcgga gaaggagaaa aggagacttc aaagccagat ggcccaatgg 960
ttgcagttgc tgagcctgtc agggtggtgg tcctagactt cagtggtgtc acctttgcag 1020
atgctgctgg ggccagagaa gtggtgcagc tggccagccg atgtcgagat gctaggatcc 1080
gcctcctcct ggctcagtgt aatgccttgg tgcaggggac actgacccgg gtaggactcc 1140
tggacagggt gactccagat cagctgtttg tgagtgtgca ggatgcagct gcttatgccc 1200
tggggagcct ggtaaggggc agtagcacca ggagcgggag ccaggaggca ctgggctgcg 1260
gcaagtgagg caggggagct cactgaccca aagatttgca ccgtgtgggt ctgacctcat 1320
catgtggagt gcagagggcc ctgatgacat gtgtgtgatg aggaccatga cccttgaacc 1380
cccttaccta acgtaactaa taaaatgaag ctgagagctt tggaatccat gaagtgagtc 1440
taggtgtttg cacagggact ctggtgcccc ttcttttgtg cccacagcat tgcagagaca 1500
caactaagaa tggctttcac caaccaccag ccctcacccc agccccagag ccacaagttc 1560
tctctgtggg ggtggggctg gagcaggtac acagagtact ggatctgaag atgcagatga 1620
ggggcacagt cttgggatta tgtgttgggg aacttcccca ccccctcggt cccaagatga 1680
gaggacagtg tttccacctt aggttcttag agtccctctg ggctctttgg cacttggaaa 1740
gtgatccccc catttcctgc cccatgagaa tgggcagggg gaggacttgg cactggctgt 1800
gggagaggtt atggctccac caggcctctg ggcactggaa aaaggagggg tgtcaccagg 1860
acaaccccta ctgggcatca ggttctgaaa aggaagagtg aggaactaga ggctcaggga 1920
cagccagtag agtgctcaca gggtggtggg gtttgttgga aattcctggc agggacaagg 1980
aggcaggccc agcctgacag tcagaaatcc ccagcgggcc atcactggga ggtcatgcac 2040
tgcagccggt gtttgaagaa gagcagccgg tgtttggcca tctggctctc gtccagtgat 2100
cctgggtaac ggtaccgggc gtaagtctct gctccggcat cccttgatgt ccaaggcagc 2160
ttcagtttgg atgcatgatc caccacgacg tcggagcagg agccaacccg aaggtaacca 2220
agcccatcca agaagaattc cagatgagcc acgcggctga agcgggggtc gaaaccgacc 2280
tcgcgcacct tgtcagtccg cgccaggaag aagttaacca cgccgtcggt gaccacgcag 2340
cctgggaagc cgacgagctc gtggtggaag cagcgccbtt gccggaggag ttcccgaggc 2400
ctggggcgcc gggctccacg ctcagcagct gccgataagt ggtggcaaag ccggagatct 2460
cgcgcnacgg ccccccacca ggtccagcgg cgtccgtcca gcacgtcaca agcttctcag 2520
cgcgtccggc cgtgaagacg aatcgttcgt caccacagca cga 2563